PREFACE


This  report  to  the  Joint  Services  Technical  Advisory  Committee  presents  a 
summary  of  the  research  programs  in  the  broad  field  of  electronics  conducted  during 
the  past  year  at  the  Polytechnic  Institute  of  New  York.  These  programs  are  pursued 
within the framework of the Microwave Research Institute, and they involve the
academic research activities of faculty in the departments of Electrical Engineering and
Electrophysics,  Physics,  and  Chemistry.  The  research  projects  cover  a broad  spec- 
trum ranging  from  basic  theoretical  investigations  in  physics,  applied  mathematics, 
and  engineering  to  experimental  efforts  involving  basic  measurements  and  the  develop- 
ment of  devices  and  materials. 

The  format  of  this  annual  report  permits  a coherent  presentation  of  the  various 
phases  of  the  Joint  Services  Electronics  Program  (JSEP)  at  the  Polytechnic  and  their 
relation  to  ongoing  research  in  electronics  sponsored  by  other  agencies.  This  presen- 
tation is  intended  for  the  information  of  the  Air  Force  Office  of  Scientific  Research,  the 
Army  Research  Office  and  the  Office  of  Naval  Research  and,  in  addition,  the  other 
sponsors  who  are  individually  acknowledged  throughout  the  report.  The  principal  aims 
of  the  JSEP  are  to  initiate  deserving  lines  of  research  in  a timely  fashion  and  to  develop 
investigations  to  a stature  sufficient  to  attract  individual  support  on  their  own  merits. 

In  the  early  days  of  the  Microwave  Research  Institute,  the  research  program 
consisted  primarily  of  projects  involving  electromagnetics  and  microwave  components. 
Although the name of the Institute has remained the same, the nature of the research
programs  has  broadened  substantially,  and  the  programs  now  encompass  a wide  range 
of  topics  within  the  field  of  electronics.  The  current  programs  are  organized  into 
twelve  areas:  electromagnetics;  acoustics;  optics;  quantum  electronics;  solid  state  and 
materials;  wave-matter  interactions;  electric  power  engineering;  communications; 
computers and computer-communication networks; safety, reliability and software
engineering;  systems,  control  and  networks;  and  data  processing.  A short  description 
of  the  nature  of  these  programs  is  presented  in  the  Introduction,  on  pages  xi  through 
xxxii. 
THE MICROWAVE RESEARCH INSTITUTE PROGRAMS

of  the 

POLYTECHNIC  INSTITUTE  OF  NEW  YORK 

(1976-1977) 


This introductory section contains an over-all summary of the research
programs in the broad field of electronics conducted at the Polytechnic Institute of
New York during the past year. These programs are organized below under the
following descriptive subject headings:

ELECTROPHYSICS: Electromagnetics; Acoustics; Optics; Quantum Electronics;
Solid State and Materials; Wave-Matter Interactions; and
Electric Power Engineering.

SYSTEMS: Communications; Computers and Computer-Communication
Networks; Safety, Reliability and Software Engineering; Systems,
Control and Networks; and Data Processing.

I.  ELECTROPHYSICS 

A.  ELECTROMAGNETICS 
Program  Director:  A.  Hessel 

The various investigations which fall under the broad heading of electromagnetics
involve the propagation, guiding, radiation, and diffraction of electromagnetic
waves in a large variety of environments. Included under this category during the past
few years are major programs in wave types near interfaces, radiation from and
scattering  from  periodic  structures,  various  antenna  investigations,  particularly  those 
involving  planar  and  conformal  phased  arrays,  radiation  from  sources  and  scattering  by 
obstacles  in  media  with  relatively  arbitrary  properties,  a systematic  exploitation  of 
ray-optical  techniques  in  propagation,  guiding,  scattering  and  antenna  problems,  and 
topics  related  to  biological  hazards  at  microwave  frequencies. 

The  program  on  wave  types  near  interfaces  some  years  ago  introduced  the 
concept  of  leaky  waves,  showed  its  value  in  the  explanation  of  many  radiation  phenomena, 
and  laid  the  foundations  for  a new  class  of  traveling-wave  antennas,  the  leaky-wave 
antenna,  which  permitted  better  agreement  between  theoretical  and  measured  radiation 
patterns  than  any  other  type  of  antenna.  The  program  led  to  a very  general  study  of 
wave  types,  including  several  categories  of  complex  guided  wave  and  lateral  wave,  and 
showed  their  interrelations.  The  lateral  wave  was  also  shown  to  be  the  mechanism 
which  permits  most  of  the  point-to-point  communication  in  a jungle  or  forest  environment. 

A current  investigation  which  relates  to  wave  types  near  interfaces  involves 
the lateral beam shift encountered by a beam of finite width when incident upon an
interface under appropriate conditions. If the beam is incident upon the interface between
two dielectric half-spaces at or very near to the critical angle of total reflection, the
resulting beam shift is called the Goos-Hänchen shift. This shift is a special case of
a larger  class  which  is  currently  under  systematic  investigation.  At  the  critical  angle 
of total reflection, the incident beam couples to a lateral wave which propagates along
the interface. If a layer is present on the interface, or if a periodic grating is located
there, and if the beam is incident at the angle corresponding to a leaky wave on the
layer or grating, a very pronounced lateral beam shift is found. This beam shift, which
is related to a pole in the wavenumber spectral plane, is much larger in general than
the beam shift due to the lateral wave, which corresponds only to a branch point. This
beam-shift mechanism has recently been used to achieve high coupling efficiencies at
optical frequencies between laser beams and thin films. The beam shift itself offers
a simple condition for optimizing the coupling parameters. The general properties of
such beam shifts and their application to optical coupling devices are under systematic
study.
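
The link between a spectral pole and the size of the beam shift can be illustrated with a short numerical sketch. It assumes, as a simplification not stated in this report, a lossless structure whose reflection coefficient near the leaky-wave angle is dominated by a single pole at beta + i*alpha together with the conjugate zero, and it takes the lateral shift of a wide beam to be the negative derivative of the reflection phase with respect to the tangential wavenumber. The function names and the values of beta and alpha are hypothetical.

```python
import cmath


def reflection_coefficient(kx, beta, alpha):
    """Single-pole, unit-magnitude model (an assumption of this sketch).

    The pole sits at beta + i*alpha and the zero at its complex conjugate,
    so |R| = 1 for real kx, as expected for a lossless structure.
    """
    pole = complex(beta, alpha)
    return (kx - pole.conjugate()) / (kx - pole)


def lateral_shift(kx, beta, alpha, dk=1e-6):
    """Estimate the shift D = -d(arg R)/d(kx) by a central difference.

    Taking the phase of the ratio R(kx+dk)/R(kx-dk) avoids phase wrapping.
    """
    r_plus = reflection_coefficient(kx + dk, beta, alpha)
    r_minus = reflection_coefficient(kx - dk, beta, alpha)
    dphi = cmath.phase(r_plus / r_minus)
    return -dphi / (2.0 * dk)


# Hypothetical values, expressed in units of the free-space wavenumber.
beta, alpha = 1.5, 0.01
print(lateral_shift(beta, beta, alpha), 2.0 / alpha)
```

Within this model the shift at phase matching (kx equal to beta) equals 2/alpha, so a weakly leaking structure produces a large shift, and the shift falls off quickly as the incidence angle moves away from the leaky-wave angle.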

The lateral beam shift is also required for a correct ray-optical model for
modal propagation in thin films and optical fibers. Such modal propagation studies,
which also involve the excitation of guiding structures of this type by Gaussian beams,
and the optical coupling studies mentioned above, are considered in more detail in the
Optics section. The leaky-wave beam shift is also shown to permit the optimization of
the coupling efficiency of acoustic wedge transducers and couplers from acoustic surface
waves to linear acoustic waveguides; these studies are reported in the Acoustics
section.

The program on periodic structures has many ramifications. One phase relates
to general studies of radiation from surfaces or interfaces which are modulated
periodically, and in this context the Brillouin diagram for radiating periodic structures
was first introduced and its usefulness clarified. This phase also included studies of
various types of periodically-modulated slow-wave antennas. Another phase involved
the study of special symmetries in periodic structures, particularly screw and glide
symmetry. Studies of scattering by periodic surfaces led to a new and physically
satisfying theory of Wood's anomalies on optical gratings, an effect which remained
poorly understood for 60 years. This theory shed substantial light on scattering resonances
in general, and these ideas have been subsequently applied to other areas, such
as scattering from plasmas and radiation from phased-array antennas. These studies
are now being extended to the investigation of effects due to the boundedness of finite
beams.

Another study relating to periodic structures led to a new way of treating
mutual-coupling effects in phased arrays which takes these effects into account rigorously
and automatically, and it proposed the first compensation scheme for the minimization
of such effects. A recent study extended this technique to the explanation of
resonance effects in large phased arrays, a phenomenon that can cause the array to
become unexpectedly "blind" at certain scan angles. This theoretical and experimental
study achieved a rather complete understanding of this phenomenon, including its causes and how
it  may  be  avoided.  In  addition,  a guided-wave  approach  has  been  applied  specifically 
to  the  analysis  of  a slot-fed  phased  array  on  a conducting  circular  cylinder.  This 
analysis  is  the  first  one  for  such  a structure  in  which  mutual-coupling  effects  have  been 
completely  taken  into  account. 

A concentrated  effort  is  under  way  to  establish  methods  of  analysis  for 
mutually-coupled  arrays  of  aperture  elements  on  non-planar  conducting  convex  and 
concave  surfaces,  using  both  modal  and  asymptotic  (ray)  methods.  Non-planar  arrays 
of  this  type  are  called  "conformal  arrays"  because  they  are  often  located  on  the  skins 
of  aircraft,  where  they  must  "conform"  to  the  shape  of  the  structure  (wing,  fuselage, 
etc.) on which they are located. The motivation for this effort is twofold: a) for
application  to  a recent  interesting  development  in  antenna  arrays  called  the  "Dome 
Antenna", which is a feed-array combination of a single planar phased array with a
passive, dome-shaped, feed-through lens, and b) for the design of flush-mounted slot
arrays  on  conical  or  ogive  surfaces  for  homing  missile  applications. 

While  the  design  of  conformal  arrays  generally  requires  the  knowledge  of  the 
coupling  coefficients  and  element  patterns  in  a mutually-coupled  convex  environment, 
a conformal  feed-through  scanning  lens  array  design  necessitates,  in  addition,  the 
knowledge  of  mutual  coupling  for  arrays  on  concave  conducting  surfaces.  The  following 
topics are currently being pursued: (1) the analysis of dually-polarized open-ended
circular  waveguide  elements,  and  (2)  the  coupling  coefficients  in  cylindrical  arrays  of 
waveguide-fed  axial  slits. 

Analysis  shows  that  the  decay  of  the  coupling  coefficients  in  arrays  on  concave 
surfaces may differ radically from that on planar arrays. The decay rate may be
considerably slower (even for ka ~ 100, where a is the local equivalent cylindrical radius).
This feature indicates that the use of a planar array matching network may not be
adequate, and that the size of the test array used for the measurement of coupling
coefficients on concave surfaces should be larger than that used in the planar case.
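
To give a sense of scale for the curvature quoted above, and assuming that k denotes the free-space wavenumber 2π/λ (the usual convention, though it is not spelled out here), the condition corresponds to a radius of roughly sixteen wavelengths:

```latex
ka = \frac{2\pi a}{\lambda} \approx 100
\quad\Longrightarrow\quad
\frac{a}{\lambda} \approx \frac{100}{2\pi} \approx 15.9
```

Under that assumption, the slower decay persists even on surfaces whose radius of curvature is many wavelengths.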

The  design  of  homing  missile  slot  arrays  requires  the  knowledge  of  coupling 
coefficients  and  element  patterns  on  conical  or  ogive  conducting  surfaces.  A computer 
program  based  on  the  harmonic  series  solution  for  a conical  geometry  is  extremely 
time  consuming.  To  alleviate  this  difficulty,  analytical  expressions  based  on  principles 
of  the  Geometric  Theory  of  Diffraction  (GTD)  have  been  obtained  which  include  modifi- 
cations to  torsional  geometry  of  the  existing  GTD  formulas  valid  for  the  torsionless  case. 
Based  on  these  expressions,  a computer  program  has  been  developed  for  the  calculation 
of  the  GTD  radial  or  circumferential  slot  far-field  patterns  on  conical  surfaces.  The 
numerical  results  compare  well  with  those  based  on  the  harmonic  series,  except  in  the 
forward  axial  (tip)  region,  and  in  regions  of  low  intensity  in  the  GTD  expressions, 
where  the  tip  scattering  that  was  neglected  in  the  GTD  analysis  is  of  importance. 

For  the  computation  of  mutual  coupling,  GTD  expressions  were  derived  for  the  surface 
current  due  to  a magnetic  current  element  located  on  a conical  surface,  based  on  the 
asymptotic  solution  of  the  canonical  problem  of  a circular  cylinder.  For  axial  slots, 
the mutual admittance values calculated via the GTD expressions for a circular cylindrical
surface for ka ~ 10 check in the deep shadow and for the near neighbors with
harmonic series results found in the literature. For circumferential slots, the asymptotic
expressions must include additional truncation functions.

The  current  program  includes  the  following  topics:  the  analysis  of  spiral 
elements  for  phased  array  applications;  the  performance  of  a two-dimensional  scanned 
lens  array;  modal  analyses  of  mutual  coupling  on  concave  spherical  surfaces;  ray 
methods  for  two-  and  three-dimensional  concave  surfaces;  and  the  synthesis  of  wide- 
angle  scanning  lens  arrays. 

The  program  on  propagation  and  scattering  in  media  with  arbitrary  properties 
was  motivated  by  the  recognition  that  recent  technology  requires  a knowledge  of  the 
propagation  of  electromagnetic  waves  in  physical  environments  of  increased  complexity, 
and  of  the  radiation  characteristics  of  antennas  and  the  scattering  characteristics  of 
obstacles  in  these  environments.  Examples  of  such  media  are  provided  by  ionized 
plasmas,  either  in  the  laboratory  or  in  outer  space,  whose  electrical  properties  can 
be  described  macroscopically  in  terms  of  an  inhomogeneous  or  homogeneous  isotropic 
dielectric,  an  anisotropic  dielectric,  or  a mechanically  deformable  dielectric  material, 
the choice of a particular model being dependent on the circumstances in question. In
addition,  turbulent  processes  in  the  medium  may  require  the  inclusion  of  statistical 
properties in its description. When a plasma medium surrounds an antenna or a scattering
object, a situation encountered when a rocket or satellite passes through the
ionosphere or when a high speed vehicle re-enters the upper atmosphere, the above-mentioned
processes of radiation and scattering become relevant for problems of
radio  communication  and  detection.  They  are  also  relevant  for  optical  detection  of 
satellites  and  other  objects  since  the  optical  signal  traverses  an  inhomogeneous  and 
turbulent  atmosphere  on  its  way  to  the  ground-based  detector. 

In  these  studies,  ray-optical  concepts  play  an  important  role.  The  explora- 
tion of  the  range  of  validity  of  ray-optical  procedures  in  propagation,  guiding,  radiation 
and  scattering  in  a general  environment,  and  the  subsequent  application  of  these  tech- 
niques to  problems  of  current  interest,  provide  the  basis  for  much  of  this  quasi-optic 
phase  of  the  electromagnetics  program. 

Particular  attention  in  the  quasi-optic  program  has  been  given  to  the  following 
subject  areas:  1)  propagation  of  pulses  in  a dispersive  environment;  2)  scattering  by 
discontinuities in homogeneously and inhomogeneously filled waveguides or ducts;
3) propagation and scattering of exponentially-decaying (evanescent) fields in lossless
or  lossy  regions  by  complex  ray  techniques;  4)  inhomogeneous  wave  tracking.  To  deal 
with  the  first  category,  an  asymptotic  theory  based  on  space-time  rays  (a  generaliza- 
tion of  geometric-optical  rays)  and  plane-wave  dispersion  surfaces  has  been  developed. 
Complicated  transient  wave  processes  are  thereby  described  in  terms  of  the  motion, 
evolution, and interaction of wave packets. The theory has been applied to spatially-varying,
temporally-varying and moving plasma media, and the results obtained are
relevant  for  the  study  of  pulse  degradation,  pulse  compression  and  related  phenomena. 

In the second category, novel applications of ray-optical (i.e., high-frequency)
methods  have  been  found  to  yield  remarkably  accurate  results  for  radiation  from,  and 
scattering  by,  discontinuity  elements  in  guiding  structures,  even  in  the  relatively-low- 
frequency  propagation  range  of  only  the  dominant  mode.  The  technique  has  recently 
been  applied  to  VLF  mode  excitation  and  conversion  in  the  earth-ionosphere  waveguide, 
and  has  now  been  extended  to  multiwave  media  such  as  compressible  plasmas  or  elastic 
solids.  Moreover,  by  viewing  stable  and  unstable  optical  resonators  as  open-ended 
waveguides  along  the  coordinate  transverse  to  the  mirror  axis,  a novel  and  promising 
analytical  method  for  determining  losses  and  field  configurations  in  open  resonators  is 
presently  under  study.  The  method  has  so  far  been  remarkably  successful  in  explaining, 
by  a simple  model,  the  complicated  loss  behavior  of  eigenmodes  in  unstable  resonators, 
which  are  widely  employed  with  high-power  lasers.  This  study  is  discussed  further  in 
the  Optics  section. 

Emphasis  in  the  third  category  is  placed  on  determining  the  local  propagation 
properties  of  evanescent  fields  for  the  purpose  of  developing  therefrom  a theory  of 
propagation  and  scattering  analogous  to  the  geometrical  theory  of  diffraction  for  non- 
evanescent  fields.  While  evanescent  fields  can  be  regarded  as  traveling  along  ray 
trajectories  in  complex  space,  the  investigation  seeks  to  clarify  the  physical  signifi- 
cance of  "complex  rays"  and  their  possible  utility  in  constructing  the  fields.  Results 
obtained  are  of  interest  for  communication  with,  or  detection  of,  objects  located  in  the 
refraction  shadow  region  of  an  antenna  illuminating  an  inhomogeneous  lossless  environ- 
ment, and  for  similar  problems  involving  lossy  media;  for  gap  coupling  to  totally- 
reflected  optical  beams;  and  for  calculation  of  diffraction  losses  in  open  optical 
resonators. 

Alternative  to  the  complex  ray  approach  is  a new  theory,  which  provides  a 
means  of  tracking  inhomogeneous  wave  fields  in  real  coordinate  space.  Examples  of 
inhomogeneous  wave  fields  are  evanescent  fields,  leaky  waves  and  Gaussian  beams. 

The advantage of this theory is that it provides direct information on phase front distortion
and amplitude changes as the wave or beam penetrates a medium and is reflected,
refracted or scattered; in particular, the theory accommodates scattering events that
lead  to  substantial  field  distortions  as,  for  example,  scattering  of  a Gaussian  beam  by 
an  obstacle  whose  size  is  comparable  to  the  beam  width.  Results  have  been  obtained 
for  various  evanescent  wave  and  beam  scattering  problems,  and  further  studies  are  in 
progress.  A very  recent  development  involves  application  of  evanescent  wave  tracking 
to  modal  propagation  in  graded-index  optical  films  and  fibers.  This  entirely  new 
analysis  promises  to  render  tractable  a broader  class  of  index  variations  than  can  be 
accommodated  by  presently  used  modal  or  ray-optical  techniques. 

It  has  long  been  known  that  there  are  various  biological  hazards  attendant  on 
the  use  of  high  microwave  power.  The  human  is  particularly  sensitive  in  this  respect. 
There  exists,  therefore,  strong  motivation  for  the  establishment  of  tolerance  limits 
and,  ultimately,  safety  standards.  A study  of  the  effects  of  microwave  radiation  on 
the  eye,  employing  rabbits  as  subjects,  is  continuing. 


B.  ACOUSTICS 

Program  Director:  H.  L.  Bertoni 

A major  program  that  has  been  underway  for  several  years  deals  with  guided 
acoustic  waves  propagating  on  the  surfaces  of,  or  at  interfaces  between,  elastic  solids. 
Because  of  the  extremely  low  velocity  of  elastic  waves,  as  compared  to  electromagnetic 
waves,  and  their  correspondingly  small  wavelength,  these  waves  have  found  application 
in  a variety  of  miniaturized  signal  processing  devices,  whose  electromagnetic  counter- 
parts would  be  cumbersome,  or  not  at  all  feasible.  Such  devices  include  delay  lines, 
pulse  compression  filters,  band  pass  filters,  high  frequency  resonators  and,  in  conjunc- 
tion with  semiconductors,  convolvers,  correlators  and  storage  elements.  These  devices 
involve  acoustic  wave  interactions  and  are  therefore  acoustic  in  nature,  but  they  are 
fitted  with  input  and  output  electro-acoustic  transducers  and  are  employed  in  electronic 
systems. 
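
The size advantage described above follows directly from wavelength = velocity / frequency. The sketch below compares the two wavelengths at a single frequency. The acoustic velocity of 3000 m/s and the operating frequency of 100 MHz are assumed, order-of-magnitude values chosen for illustration; they are not taken from this report.

```python
C_LIGHT = 3.0e8  # free-space electromagnetic velocity in m/s (rounded)


def wavelength(velocity_m_per_s, frequency_hz):
    """Wavelength in metres for a wave of the given velocity and frequency."""
    return velocity_m_per_s / frequency_hz


def size_ratio(acoustic_velocity_m_per_s):
    """Factor by which the acoustic wavelength is shorter at equal frequency."""
    return C_LIGHT / acoustic_velocity_m_per_s


def delay_per_metre(velocity_m_per_s):
    """Propagation delay in seconds accumulated over one metre of path."""
    return 1.0 / velocity_m_per_s


# Assumed illustrative values (not from the report).
v_acoustic = 3000.0
frequency = 100e6

print(wavelength(C_LIGHT, frequency))     # electromagnetic wavelength, metres
print(wavelength(v_acoustic, frequency))  # acoustic wavelength, metres
print(size_ratio(v_acoustic))             # shrink factor
print(delay_per_metre(v_acoustic) * 0.01) # delay over 1 cm, seconds
```

With these assumed numbers the electromagnetic wavelength is a few metres while the acoustic wavelength is tens of micrometres, and a centimetre of substrate provides a delay of a few microseconds, which is why delay lines are a natural application.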

Motivated  by  these  applications,  the  program  has  sought  to  explore  device 
applications  as  well  as  the  basic  wave  scattering  and  coupling  phenomena  that  occur  in 
such  devices.  Many  of  these  scattering  and  coupling  phenomena  also  arise  in  methods 
which  are  used  for  the  nondestructive  testing  of  solid  materials  for  flaws,  cracks,  etc. 
Our  program  is  currently  exploring  the  ways  in  which  known  methods  of  nondestructive 
testing  or  evaluation  can  be  made  more  quantitatively  accurate,  and  also  devising  new 
methods. 


In  order  to  study  acoustic  wave  scattering  and  coupling  phenomena  in  a sys- 
tematic manner,  we  introduced  and  developed  a microwave  network  and  transmission 
line  formalism  for  acoustic  waves  in  isotropic  solids.  Using  this  formulation, 
transmission-line  representations  have  been  derived  for  bulk  acoustic  waves,  and 
equivalent  networks  were  obtained  for  several  types  of  planar  interface.  These  net- 
works in  turn  were  used  to  derive  the  characteristics  of  waves  guided  by  a variety  of 
planar  structures,  such  as  free  or  welded  plates  and  plated  surfaces. 

The  transmission  line  and  network  approach  also  forms  the  basis  for  a sys- 
tematic investigation  of  the  properties  of  several  basic  waveguides  for  acoustic  surface 
waves.  These  waveguides  are  linear  (as  opposed  to  planar)  configurations  which  serve 
to  laterally  confine  the  acoustic  fields;  the  structures  under  detailed  examination  in  the 
program  include  the  strip  and  slot  waveguides,  which  are  flat  overlay  guides,  and  the 
rectangular  ridge  topographic  and  tall  overlay  waveguides.  The  studies  are  primarily 
theoretical,  but  some  measurements  are  also  being  taken.  The  analytical  results  for 
the  strip  and  slot  waveguides  are  the  most  accurate  ones  available,  and  the  analyses 
for the rectangular ridge structures form the only analytical theories to date for those
guides,  although  very  accurate  numerical  results  have  been  published  for  the  topo- 
graphic structure.  These  analytical  expressions,  furthermore,  agree  well  with  both 
our  own  measurements  and  those  of  others.  We  are  also  studying  the  properties  of 
the  ridge  structure  of  semi-infinite  height,  which  can  guide  waves  along  its  top  edge, 
in  order  to  better  understand  the  behavior  of  ridge  waveguides  of  finite  height. 

The transmission line and network approach is not only useful in the analysis
of  waveguides,  but  it  has  also  permitted  quantitative  design  in  a novel  method  for  the 
efficient  excitation  of  these  linear  waveguides.  This  method  is  a two-dimensional 
version  of  the  prism  coupler  used  in  integrated  optics  for  coupling  a laser  beam  to  a 
surface  wave  on  a thin  film.  In  its  application  to  the  excitation  of  the  acoustic  strip 
waveguide,  an  efficiency  of  65%  over  a 25%  band  was  achieved  without  any  cut  and  try. 

Novel  components  for  acoustic  surface  waves  are  also  being  devised,  and  they 
are  being  analyzed  with  the  help  of  the  networks  mentioned  above.  Among  these  com- 
ponents are  simple  filters,  bulk  wave  suppressors,  power  splitters,  nonreflecting 
delay-line  taps,  and  resonant  cavities,  all  based  on  either  total  reflection  or  partial 
reflection  produced  by  a narrow  strip  of  plated  material.  These  components  have  the 
added  advantage  that  they  can  be  used  on  any  substrate,  piezoelectric  or  not.  The 
effect  of  a beam  of  finite  width  on  the  performance  of  these  components  is  presently 
under  theoretical  study.  A transmission  line  representation  for  bulk  waves  in  piezo- 
electric crystals  has  also  been  developed,  and  it  was  used  to  treat  a novel  class  of 
bulk  wave  filters. 

An  important  class  of  surface  acoustic  wave  devices  for  signal  processing 
employs  nonlinear  space  charge  effects  in  semiconductors  adjacent  to  piezoelectric 
materials.  Studies  being  performed  here  relating  to  this  class  of  devices  include 
convolvers,  an  acoustic  phase-locked  loop,  correlators,  time  compressors  and  storage 
devices.  The  use  of  semiconductors  in  proximity  to  a piezoelectric  substrate  was  first 
proposed  at  the  Polytechnic;  subsequent  work  here  has  added  to  both  the  theoretical  and 
experimental  development  of  such  devices.  Besides  the  original  work  on  convolvers, 
the  acoustic  phase-locked  loop  was  invented  here,  and  a fully  integrated  version  is  now 
under  development.  Real  time  correlation  and  time  compression  were  also  obtained 
here,  and  these  devices  are  currently  being  refined,  together  with  a storage  correlator. 

The  first  complete  theoretical  analysis  has  been  carried  out  of  the  phenomena 
associated  with  the  reflection  of  a bounded  acoustic  beam  from  the  surface  of  a solid 
at  the  angle  for  phase  matching  with  the  surface  wave.  The  primary  effect  is  a large 
lateral  shift  of  the  reflected  beam  from  the  position  predicted  by  geometrical  optics. 
Experimental  observations  carried  out  elsewhere  for  the  case  of  a beam  incident  from 
a fluid  are  in  complete  agreement  with  the  theory.  Subsequent  studies  for  a beam 
incident  from  a second  solid  have  led  to  a complete  theoretical  understanding  of  the 
performance  of  wedge  transducers  for  exciting  surface  waves;  this  theory  has  shown, 
for  the  first  time,  how  high  efficiencies  can  be  achieved  for  such  a device.  These 
phenomena  also  offer  a new  method  for  the  nondestructive  evaluation  of  the  surface 
properties  of  a solid,  and  their  use  is  currently  under  investigation. 

In  addition,  we  are  studying  the  propagation  of  waves  resulting  from  the  coupling 
of  magnetic  moments  in  a ferrite  material  to  the  elastic  stress.  The  dispersion  and 
polarization  characteristics  of  such  magnetoelastic  waves  have  been  found  for  both  bulk 
and  surface  waves. 
C. OPTICS

Program  Director:  L.  Bergstein 

Great  progress  has  been  made  recently  in  broadening  the  research  program  in 
optics.  The  theoretical  studies  have  been  extended  and  the  experimental  effort  has  been 
expanded  to  include  the  fabrication  and  measurement  of  integrated  optics  structures  and 
related  thin-film  devices.  The  present  program  consists  of  the  following  topics:  the 
development  of  a new  narrow-band  matched  optical  filter  (theoretical  and  experimental), 
research  in  integrated  optics  and  related  thin-film  devices  (theoretical  and  experimental), 
and  the  investigation  of  unstable  optical  resonators  (theoretical  only).  The  topic  of 
integrated  optics  is  the  one  which  is  expanding  most  rapidly,  and  it  currently  consists 
of  several  different  projects. 

Development  is  progressing  on  a class  of  narrow-band  optical  passband  and/or 
rejection  filters  based  on  selective  reflection  from  atomic  vapors.  These  filters,  which 
can operate in the 580 to 8550 Å range, are rugged, of small size, and have a large
acceptance angle and high spectral resolution. Because of these factors, the vapor filters
are expected to be superior to presently available devices in pollution monitoring and
other applications which require simultaneous multichannel spectrochemical analysis for
trace elements. They are also expected to be useful in the detection of isotope concentrations,
as a narrow-bandwidth ultraviolet mirror for high intensity radiation, and in
astronomical observations. Tests performed on a mercury vapor filter verified the
theoretical predictions. With a center wavelength of 2537 Å, the filter has shown a peak
reflectance of 40%, an acceptance angle of 7 degrees, and a bandwidth of only 0.14 Å.

The  filter  was  used  in  an  emission  flame  photometer  to  perform  a spectrochemical  anal- 
ysis for  mercury  in  a water  solution.  The  device  was  also  used  to  take  a monochromatic 
photograph  of  the  spatial  distribution  of  excited  mercury  in  the  flame  from  which  an  esti- 
mate of  the  flame  temperature  profile  could  be  made. 
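
The figures quoted above for the mercury vapor filter can be turned into a resolving power, defined in the usual way as center wavelength divided by bandwidth. The helper name below is illustrative; only the two wavelength values come from the text.

```python
def resolving_power(center_wavelength, bandwidth):
    """Ratio of center wavelength to bandwidth (both in the same units)."""
    return center_wavelength / bandwidth


# Values quoted for the mercury vapor filter, in angstroms.
center_angstrom = 2537.0
bandwidth_angstrom = 0.14

print(round(resolving_power(center_angstrom, bandwidth_angstrom)))
```

The result is on the order of 18,000, which gives a rough measure of the "high spectral resolution" claimed for this class of filter.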

This  device  principle  is  being  extended  to  elements  which  are  more  difficult 
to  use  since  they  require  a higher  operating  temperature.  An  example  of  this  type  of 
element  is  sodium.  If  a successful  filter  can  be  built  using  sodium,  the  same  structure 
can  be  used  to  contain  a variety  of  elements.  At  present  the  fabrication  of  a cell  uti- 
lizing a nickel  housing  and  sapphire  windows  is  nearing  completion. 

The  purpose  of  the  program  on  integrated  optics  is  to  develop  film-layered 
structures  and  other  micro-optic  devices  for  eventual  application  in  optical  communica- 
tions. It  began  with  an  investigation  of  the  coupling  and  progression  of  optical  beams 
along  and  through  thin  films  and  other  optical  waveguides.  Planar  and  cylindrical 
polarization-independent  waveguides  have  been  investigated  and  were  shown  to  be  prac- 
tical. Such  waveguides  can  support  non-polarized  beams  and  can  transmit  information 
at  twice  the  rate  of  ordinary  waveguides.  For  planar  waveguides  it  was  found  that  phase 
and  group  velocity  matching  of  a TE  and  a TM  signal  can  be  achieved  by  the  addition  of 
only  a single  thin-film  layer.  Such  a scheme  can  also  be  used  in  nonlinear  interaction 
applications  to  match  the  group  velocity  of  two  optical  signals  of  different  wavelengths 
and/or  field  distributions.  Results  analogous  to  those  found  for  planar  waveguides 
have  also  been  found  for  cladded  fibers.  An  approximate  (but  general)  transmission 
line  equivalent  which  we  have  developed  for  this  class  of  structures  reduces  considerably 
the  analytical  complexity  of  the  problem  and  allows  the  determination  of  the  modal 
dispersion  characteristics  in  a fairly  straightforward  manner.  Moreover,  we  have 
found a realizable radially-variable refractive index distribution n(r) for which all
rotationally-symmetrical TE and TM modes have the same group velocity and spatial
power distribution. The modes and eigenvalues of such an optical fiber and its
dispersion properties were found.
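
The TE/TM mismatch that motivates the polarization-independent designs discussed above can be illustrated with the textbook dispersion relation for an ordinary three-layer planar waveguide. The sketch below is not the transmission-line equivalent developed in this program. It simply solves the standard slab-waveguide eigenvalue equation by bisection, and the indices, film thickness and wavelength in the example are assumed values chosen only for illustration.

```python
import math


def slab_effective_index(n_film, n_sub, n_cover, thickness_um, wavelength_um,
                         polarization="TE", order=0):
    """Effective index of a guided mode of a three-layer slab (n_film > n_sub >= n_cover).

    Solves  kappa*d = order*pi + phi_sub + phi_cover  by bisection and
    returns None if the requested mode is below cutoff.
    """
    k0 = 2.0 * math.pi / wavelength_um

    def mismatch(n_eff):
        kappa = k0 * math.sqrt(n_film ** 2 - n_eff ** 2)
        gamma_s = k0 * math.sqrt(n_eff ** 2 - n_sub ** 2)
        gamma_c = k0 * math.sqrt(n_eff ** 2 - n_cover ** 2)
        if polarization == "TM":
            # TM boundary conditions weight the decay constants by index ratios.
            gamma_s *= (n_film / n_sub) ** 2
            gamma_c *= (n_film / n_cover) ** 2
        return (kappa * thickness_um - order * math.pi
                - math.atan2(gamma_s, kappa) - math.atan2(gamma_c, kappa))

    lo, hi = n_sub, n_film
    if mismatch(lo) <= 0.0:
        return None  # mode is cut off
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumed example: 1.0-micron film (n = 1.6) on a substrate (n = 1.5) in air.
n_te = slab_effective_index(1.6, 1.5, 1.0, 1.0, 0.6328, "TE")
n_tm = slab_effective_index(1.6, 1.5, 1.0, 1.0, 0.6328, "TM")
print("TE0 effective index:", n_te)
print("TM0 effective index:", n_tm)
```

In an ordinary slab of this kind the TE mode has the larger effective index, so the two polarizations travel at different phase velocities. This is the mismatch that the additional thin-film layer described above is intended to remove.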

The beam-shift phenomenon which occurs when a beam is incident on a planar
structure at the angle corresponding to a leaky wave supportable by the structure was
mentioned in the section on Electromagnetics. This phenomenon was investigated
theoretically  in  quite  general  terms,  and  the  conditions  for  optimum  beam-to-surface- 
wave  coupling  due  to  optical  beams  of  finite  width  were  derived  for  both  uniform  thin- 
film  couplers  and  grating  couplers. 

The  techniques  developed  by  us  previously  for  treating  periodic  structures 
have  been  combined  with  the  beam-shift  considerations  and  applied  to  the  detailed  anal- 
ysis and  design  of  planar  grating  couplers.  These  integrated-optics  components  serve 
the  important  function  of  coupling  an  optical  beam  into  (or  out  of)  a surface  wave  guided 
by  thin  films.  In  particular,  an  exact  solution  of  the  boundary-value  problem  posed  by 
dielectric  gratings  has  been  formulated  and  a computer  program  has  been  set  up  to 
generate  highly  accurate  numerical  results.  In  this  manner,  the  coupling  efficiency, 
operating  conditions  and  other  data  have  been  determined  for  a large  variety  of  grating 
couplers.  In  addition,  a simplified  perturbation  procedure  requiring  much  shorter 
computation  times  has  also  been  developed  and  found  to  yield  excellent  approximations 
of  the  exact  results.  These  techniques  are  being  used  to  determine  dimensions  for  a 
class  of  blazed  gratings  that  are  capable  of  a substantially  higher  coupling  efficiency 
than  that  of  ordinary  gratings.  All  of  these  techniques  are  instrumental  in  providing  a 
range  of  simple  criteria  for  the  design  of  dielectric  gratings  having  desirable  diffraction 
and  waveguiding  properties. 
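
As a point of reference for the grating couplers discussed above, the sketch below evaluates only the standard phase-matching (synchronism) condition between an incident beam and a guided surface wave. It is not the exact boundary-value solution or the perturbation procedure described in the text, and it says nothing about coupling efficiency. The effective index, wavelength and grating period in the example are assumed values.

```python
import math


def grating_coupling_angle_deg(n_eff, wavelength_um, period_um,
                               order=1, n_cover=1.0):
    """Incidence angle (degrees, measured from the normal) satisfying

        n_cover * sin(theta) + order * wavelength / period = n_eff,

    or None if no real angle exists for this diffraction order.
    """
    sin_theta = (n_eff - order * wavelength_um / period_um) / n_cover
    if abs(sin_theta) > 1.0:
        return None
    return math.degrees(math.asin(sin_theta))


# Assumed example: guided mode with effective index 1.55, He-Ne wavelength,
# 0.40-micron grating period, beam incident from air.
angle = grating_coupling_angle_deg(1.55, 0.6328, 0.40)
print("first-order coupling angle (deg):", angle)
```

A negative angle simply means that the beam must be inclined against the direction of propagation of the guided wave. With `order=0` the function returns `None`, reflecting the fact that a beam incident from air cannot be phase matched to a guided wave without the grating.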

A laboratory  facility  has  been  established  and  experiments  are  now  in  progress 
to  verify  some  of  the  analytical  results  and  to  assess  the  performance  of  certain  micro- 
optic devices.  The  facility  includes  a capability  to  produce  solution-deposited  films 
with  a view  towards  making  active  devices,  and  a scanning  electron  microscope  which 
has  been  modified  to  write  patterns  for  optical  waveguides,  gratings  and  circuits.  A 
computer-controlled  digital  interface  was  developed  to  generate  and  control  the  desired 
scanning  pattern. 

A model  has  been  developed  to  describe  the  exposure  and  development  charac- 
teristics of  photoresist.  The  accuracy  of  this  model  has  been  verified  and  the  model 
has  been  applied  to  the  fabrication  of  grating  couplers.  The  couplers  so  constructed 
were  found  to  have  properties  that  agree  well  with  the  predictions  of  the  theory  men- 
tioned above. 

Several  additional  theoretical  investigations  have  recently  been  initiated 
relating  to  the  area  of  integrated  optics.  One  of  these  is  an  extension  of  the  beam-shift 
considerations,  mentioned  in  the  Electromagnetics  section,  involving  lateral  waves 
(rather than leaky waves, which are appropriate to the grating couplers mentioned above).
These lateral ray and beam shifts on boundaries with incidence-angle-dependent
reflection coefficients are currently receiving special attention because it has been recognized
that the correct ray-optical model for modal propagation in films and layers requires
laterally shifted paths. A systematic study of lateral shifts, both longitudinal
(Goos-Hänchen type) and transverse (Imbert type), for isotropic and anisotropic, homogeneous
and graded-index, two-dimensional and three-dimensional, guiding regions is presently
in progress.
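
The simplest instance of such a lateral shift is the longitudinal (Goos-Hänchen) displacement of a TE-polarized beam totally reflected at a plane interface between two homogeneous isotropic media. The sketch below covers only that textbook case, not the anisotropic, graded-index or transverse (Imbert) cases under study. It assumes the stationary-phase estimate D = -dφ/dk_x, where φ is the phase of the reflection coefficient, and it uses one common sign convention for φ. The index ratio and angle in the example are assumed values.

```python
import math


def reflection_phase_te(theta, n_rel):
    """Phase of the TE reflection coefficient under total internal reflection.

    theta is the incidence angle in radians; n_rel = n2 / n1 < 1.
    """
    s = math.sqrt(math.sin(theta) ** 2 - n_rel ** 2)
    return -2.0 * math.atan2(s, math.cos(theta))


def shift_numeric(theta, n_rel, wavelength_in_medium, h=1e-6):
    """Lateral shift D = -d(phase)/d(k_x), with k_x = k1 * sin(theta)."""
    k1 = 2.0 * math.pi / wavelength_in_medium
    dphase = (reflection_phase_te(theta + h, n_rel)
              - reflection_phase_te(theta - h, n_rel)) / (2.0 * h)
    return -dphase / (k1 * math.cos(theta))


def shift_closed_form(theta, n_rel, wavelength_in_medium):
    """Closed form of the same derivative for the TE case."""
    s = math.sqrt(math.sin(theta) ** 2 - n_rel ** 2)
    return wavelength_in_medium / math.pi * math.tan(theta) / s


# Assumed example: glass-to-air interface (n_rel = 1/1.5), 60 degrees incidence,
# shift expressed in units of the wavelength inside the denser medium.
theta = math.radians(60.0)
d_num = shift_numeric(theta, 1.0 / 1.5, 1.0)
d_cf = shift_closed_form(theta, 1.0 / 1.5, 1.0)
print("shift in wavelengths (numeric, closed form):", d_num, d_cf)
```

The shift is of the order of one wavelength well above the critical angle and grows as the critical angle is approached, which is why it cannot be neglected in a ray-optical model of thin-film modes.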

Another new theoretical area involves the determination of the propagation
characteristics of linear waveguiding structures for integrated optics. Under
investigation are two waveguides which are composed of metal-clad regions on thin films
located on a substrate. One of these guides consists of a slot between two metal-clad
regions: a basic constituent mode was neglected in analyses performed elsewhere, and
we find that important performance differences occur in a certain frequency range.
The other waveguide is a novel one consisting of a metal strip on a thin film on a
substrate.  This  guide  employs  the  "plasmon"  mode  and  is  lossier  than  other  known 
linear  waveguides,  but  it  seems  to  be  dispersionless  over  a large  frequency  range  and 
may  therefore  be  of  distinct  interest.  The  properties  of  these  two  guides  are  being 
examined  in  greater  detail. 

An  additional  theoretical  topic  is  an  application  of  the  technique  for  the  tracking 
of  inhomogeneous  fields,  mentioned  in  the  Electromagnetics  section.  As  indicated 
there,  application  is  being  made  to  modal  propagation  in  graded-index  optical  films 
and  fibers;  this  new  approach  permits  the  treatment  of  a wider  class  of  refractive  index 
variations  than  can  be  handled  by  other  techniques. 

Several  novel  thin-film  devices  have  been  devised  and  investigated  both  theo- 
retically and  experimentally.  Thin-film  polarizers,  beam  splitters,  and  (fixed  and 
tunable)  narrow-bandwidth  spectral  filters  for  the  visible  and  10-micron  region  have 
been  investigated  and  were  shown  to  be  feasible  in  both  the  usual  optical  beam-shaping 
applications  and  in  integrated  optics  applications,  where  these  signal  shaping  compo- 
nents can  become  part  of  the  coupling  scheme.  Experimental  tests  have  confirmed  the 
theoretical  results.  A five-layer  FTR  narrow-bandwidth  spectral  filter  and  a single- 
layer FTR  beamsplitter  with  transmission  properties  which  are  independent  of  the 
polarization  of  the  incident  beam  were  developed  and  tested  and  their  properties  were 
found  to  agree  with  the  theoretical  predictions. 

Another  new  project  involves  the  analysis  of  certain  geometrical  discontinuity 
problems.  As  a first  example,  we  have  solved  an  important  discontinuity  problem  for 
a class  of  planar  optical  waveguides;  the  discontinuity  consists  of  a change  in  height  of 
a dielectric  layer.  In  our  modal  expansion  approach,  which  is  fairly  general,  we  place 
the  open  waveguide  structure  between  two  conducting  planes  and  then  allow  the  separa- 
tion between  the  planes  to  reach  infinity.  This  greatly  reduces  the  complexity  of  the 
problem  and  permits  the  solution  of  a wider  class  of  problems  than  is  possible  with 
other  methods  found  in  the  literature.  The  solution  we  have  found  for  single-mode 
waveguides  agrees  well  with  those  found  in  the  literature  by  less  general  and/or  more 
cumbersome  methods. 

The  investigation  of  the  properties  of  unstable  resonators  is  continuing.  Two 
independent  approaches  are  employed;  one  involves  a longitudinal  approach  whereas  the 
other  uses  a transverse,  or  waveguide,  viewpoint,  discussed  briefly  in  the  section  on 
Electromagnetics.  Using  the  longitudinal  approach,  a meaningful,  approximate  solution 
has  been  found  for  the  complete  set  of  modes  and  associated  eigenvalues  of  optical 
resonators  filled  with  an  active  medium  having  transverse  variations  in  both  the  gain 
and refractive index. This has led to a simple physical picture which permits
substantial new insight into the mechanism of operation of such resonators and will help in the
determination  of  the  optimum  resonator  parameters  which  are  to  be  used  in  high-power 
laser  devices. 

For  unstable  resonators  whose  transverse  boundaries  are  determined  by  the 
mirror  dimensions,  the  waveguide  approach  to  unstable  resonators  formed  by  hyper- 
bolic strip  mirrors  has  been  found  successful  in  explaining  in  a remarkably  simple 
manner  the  complicated  behavior  of  the  complex  eigenvalues  of  the  resonant  modes. 
These  results  are  now  being  extended  to  circular  mirror  configurations  and  attention  is 
being  given  to  the  effects  of  inhomogeneity  in  the  resonator  medium. 


D.  QUANTUM  ELECTRONICS 
Program  Director:  W.  T.  Walter 


The  research  program  in  the  area  of  Quantum  Electronics  comprises  that 
portion  of  electronics  in  which  quantum  mechanical  effects  become  important  because 
of  the  discrete  nature  of  charge  carriers,  matter  or  radiation.  Optics  and  Wave- 
Matter  Interactions  are  related  research  areas  whose  boundaries  with  Quantum  Elec- 
tronics are  not  distinct.  Some  of  the  research  previously  reported  in  the  Quantum 
Electronics  area  has  been  shifted  in  this  report  to  the  Optics  or  Wave-Matter  Inter- 
action research  areas. 


A research  program  has  been  developed  which  has  attacked  outstanding  prob- 
lems of  quantum  electronics  on  a broad  front.  Significant  results  include:  the  develop- 
ment of  a new  type  of  optical  and  infrared  spectroscopy  which  has  demonstrated  more 
than  two  orders  of  magnitude  improvement  in  resolution  compared  to  previous  tech- 
niques, a corresponding  improvement  in  the  long-term  frequency  stabilization  of  lasers, 
significant  progress  in  the  understanding  of  the  irreducible  quantum  fluctuations  which 
are  present  in  the  output  of  lasers  and  other  quantum  electronic  devices,  advances  in 
the  understanding  of  the  propagation  of  ultra-short  pulses  in  resonant  media,  progress 
in  the  development  of  a new  wide-angle  low-noise  laser  receiver,  an  improved  under- 
standing of  the  properties  of  the  acousto-optical  interaction  and  the  use  of  this  inter- 
action to  provide  acoustical  gain  and  also  to  mode  lock  a ruby  laser,  a significant 
improvement  in  the  brightness  and  in  the  average  power  generation  density  of  the 
very-high-gain  pulsed  copper  vapor  laser,  construction  of  a computer  model  to  analyze 
these  pulsed  metal  vapor  lasers,  and  the  development  of  a procedure  for  calculating 
improved  values  of  atomic  transition  probabilities. 


One  group  of  projects  has  centered  on  the  study  of  the  nonlinear  response  of 
saturable  absorbers  to  intense  laser  beams.  These  projects  have  yielded  a new  type 
of  ultra-high-resolution  spectroscopy,  significant  improvement  in  the  long-term  fre- 
quency stabilization  of  lasers,  and  have  also  contributed  to  the  progress  on  an  auto- 
matically-adapting filter  which  offers  the  promise  of  a significant  improvement  in  the 
performance  of  an  optical  radar  receiver.  Laser  Saturated  Resonance  Spectroscopy 
(LSRS) has been applied to the study of the absorption spectrum of the SF6 molecule at
10.6 μm and the I2 and Br2 molecules at 0.6328 μm. A resolving power of 10° was
demonstrated, representing a two-order-of-magnitude improvement over conventional
spectroscopic techniques. The study of the LSRS of xenon has now been completed.
All of the observed components in the LSRS spectrum of a naturally occurring mixture
of xenon isotopes (masses 128, 129, 130, 132, 134 and 136) at 3.5 μm have been
resolved and identified. The narrow Lamb dips have also been used to stabilize the
frequencies of both the CO2 (10.6 μm) and He-Ne (0.6328 μm) lasers. A laser which
is  frequency  stabilized  to  a Lamb  dip  is  a strong  candidate  for  selection  as  an  optical 
(or  infrared)  wavelength  standard,  and  will  also  be  useful  for  long-range  interferometry, 
for  deep-space  communication  links,  for  coherent  optical  radar,  and  for  seismic  studies. 


The  high  intensities  achievable  in  laser  beams  can  perturb  atomic  energy  levels 
and  radically  alter  the  position  and  shape  of  spectral  lines.  The  theory  of  these  phenom- 
ena is  being  extended  and  refined.  Recent  studies  have  centered  on  the  inclusion  of 
finite  lifetime  and  recoil  effects.  New  results  were  obtained  on  the  effect  on  the  molec- 
ular center-of-mass  motion  of  a resonant  driving  field.  The  analysis  exploited  a novel 
calculation  technique  based  on  the  application  of  Floquet's  theorem  to  periodically- 
driven  quantum-mechanical  systems.  It  is  hoped  that  these  interesting  effects  at  optical 
frequencies  will  eventually  be  observed  and  compared  with  theory. 

Another theoretical investigation in progress aims at understanding the
irreducible quantum fluctuations always present in the output of lasers and other quantum
electronic  devices,  from  the  point  of  view  of  first  principles.  A study  has  been  com- 
pleted of  the  effect  of  an  internal  homogeneously-broadened  saturable  absorber  on  the 
intrinsic  fluctuations  of  a laser  oscillator.  A related  effort  is  concerned  with  the  under- 
standing of  spontaneous  emission  processes  in  resonant  dispersive  configurations  other 
than  free  space.  This  problem  has  been  solved  in  general  terms  by  relating  it  to  the 
problem  of  calculating  thermal  field  distributions.  Recently,  the  spontaneous  emission 
noise  and  the  response  characteristics  of  driven  quantum  electronic  systems  have  been 
evaluated  for  classes  of  spatially  degenerate  molecules.  Effects  associated  with  the 
nonstationary character of the correlation functions have been particularly emphasized.
These effects have been overlooked in recently published work on the response of driven
systems.

The  study  of  the  propagation  of  ultra-short  pulses  in  resonant  media  has  led  to 
the  discovery  of  new  relationships  between  their  spatial  and  temporal  structures.  These 
include:  strong  self  focusing,  phase  modulation  which  accompanies  transverse  con- 
finement of  the  pulse,  additional  unsuspected  structure  of  plane  wave  pulses,  some 
effects  implied  by  the  inclusion  of  a finite  transverse  relaxation  time,  the  transverse 
stability  of  confined  fields,  new  types  of  distortionless  fields  which  simultaneously 
possess  phase  modulation,  polarization  modulation  and  three-dimensional  structure, 
and  some  effects  produced  by  a nonlinear  index  of  refraction  of  the  host  material. 

The  spectroscopic  investigations  of  saturable  absorbers  and  the  study  of 
propagation  of  ultra-short  pulses  in  resonant  media  relate  directly  to  the  feasibility 
study  of  the  low-noise  wide-angle  BALAD  (Bleachable  Absorber  Laser  Amplifier- 
Detector) receiver. In this receiver, an absorption cell (filled for example with SF6)
is placed between the laser amplifier and the detector to serve as a threshold filter,
both in the frequency domain and in two spatial dimensions, and it screens out all
background light which is not coherent with the signal beam. Calculations predict superior
performance in optical radar and communications applications. The LSRS studies have
clarified the effect on laser amplifier gain, bandwidth and band center that would result
from the use of separated single xenon isotope samples in an optical radar system with
a BALAD receiver.

Visible and near-visible lasers with an efficiency and power comparable to the
far infrared CO2 laser are still lacking. The vapors of certain metals and transition
elements have attractive energy level structures. A number of high-temperature laser
systems have been constructed to explore the most-promising metal vapor laser media.
The pulsed copper vapor laser has high gain and high peak power at 5106 Å. Moreover,
unlike other self-terminating lasers, single mode output and therefore high peak
brightness can be obtained. Recent experiments using higher pulse repetition rates and a
larger volume have produced an average power of 11 watts. With argon as an additive
gas, average powers above one watt have been achieved at 5106 Å from an active region
only 10 cm long and 1.25 cm in diameter, corresponding to an average power production
of 0.1 watts per cubic centimeter. A heat-pipe copper vapor laser has been
demonstrated, and preliminary experiments with a high-loss resonator have yielded a peak
brightness of lO-^'" watts per square centimeter per steradian, which is the highest brightness
measured for a gas laser. Heat-pipes and the use of dissipated discharge energy for
self-heating have potential for practical sealed-off copper vapor lasers with average
output powers up to 100 watts in the visible. Still higher output powers and overall
electrical efficiencies of up to 10% are anticipated from the current investigation of
impedance tailoring of copper vapor systems. Other metal vapor systems are being examined
for efficient laser action in the ultraviolet and blue-green spectral regions. A computer
model has been constructed and is being used in conjunction with the experimental
analysis of several metal systems. Theoretical analysis of these systems for higher
efficiency lasers has led to the development of a procedure for calculating improved values
of atomic transition probabilities from spectral intensities.
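
The power density quoted above for the argon-additive copper vapor experiment can be checked from the stated dimensions of the active region. The sketch below assumes a right circular cylinder and uses the lower bound of one watt for the average power. The function name is illustrative only.

```python
import math


def cylinder_volume_cm3(length_cm, diameter_cm):
    """Volume of a right circular cylinder."""
    return math.pi * (diameter_cm / 2.0) ** 2 * length_cm


# Dimensions and power quoted in the text: 10 cm long, 1.25 cm in diameter,
# average power of (at least) one watt.
volume = cylinder_volume_cm3(10.0, 1.25)
density = 1.0 / volume
print(f"active volume ~ {volume:.1f} cm^3")
print(f"power density ~ {density:.2f} W/cm^3")
```

The active volume is about 12 cubic centimeters, so one watt corresponds to roughly 0.08 watts per cubic centimeter, consistent with the rounded figure of 0.1 for average powers somewhat above one watt.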


E.  SOLID  STATE  AND  MATERIALS 
Program  Director:  H.  J.  Juretschke 

The  solid  state  research  ranges  over  a wide  area  of  experimental  activities 
encompassing  the  creation  of  new  or  better  materials,  the  investigation  of  many  specific 
properties  of  solids,  and  the  application  of  these  materials  and  effects  to  practical  sit- 
uations. In  addition,  there  are  active  programs  in  various  aspects  of  solid  state  theory. 

In  the  area  of  materials  development,  an  important  sector  is  concerned  with 
new  magnetic  compounds  involving  rare  earth  and  transition  metal  ions,  and  with  the 
production  of  phosphors  for  IR-to-visible  conversion.  Programs  involving  thin  films 
include  the  development  of  high-mobility  semiconducting  layers,  and  the  preparation  of 
highest  quality  single  crystal  films  of  the  noble  and  ferromagnetic  metals  and  the  semi- 
metals. In  addition,  there  are  systematic  efforts  to  produce  good  insulating  barriers  on 
semiconductor  and  metal  single  crystal  surfaces,  for  application  to  tunneling  and  other 
devices. 


Among  the  properties  of  interest,  those  related  to  electromagnetic  interactions 
play  a central  role.  Electron  transport  in  semiconductors  is  studied  for  hot-electron 
nonlinear  effects,  and  for  surface  interactions.  Size  and  surface  effects  in  carrier  trans- 
port in  magnetic  fields,  especially  as  related  to  the  band  structure  and  to  surface  relax- 
ation mechanisms,  are  investigated  in  thin  films.  One  group  of  studies  centers  on  the 
spin  dependence  of  electrical  conduction  in  magnetic  materials,  while  another  is  con- 
cerned with  the  characteristics  of  thin  tunneling  junctions,  either  vacuum  or  insulating, 
between  normal  or  superconducting  materials.  New  research  includes  the  interaction  of 
x-rays  with  electrons  in  semiconductors  through  the  internal  photoelectric  effect,  the 
determination  of  charge  concentrations  at  metal-insulator  interfaces  by  surface  electron 
scattering,  the  details  of  metallic  surface  self  diffusion  as  it  affects  the  electrical  sur- 
face properties,  and  the  concentration  profiles  in  alloys  of  thin  films  as  affected  by  the 
large surface-to-volume ratio.

Studies  in  magnetism  include  a study  of  the  magnetic  coupling  between  transition 
metal  ions  in  complex  fluorides,  using  neutron  diffraction,  magnetic  susceptibility  and 
Mössbauer spectroscopy. Magnetic resonance measurements (NMR, ESR) are being
used to study clustering of rare-earth ions doped in CdF2. This material has been
developed as a base for infrared to visible light conversion, which has potential application in
a variety  of  display  devices.  Another  study  is  looking  at  the  origins  of  the  frequency 
dependence  of  magnetic  anisotropy  in  ferrites  and  at  the  anisotropy  of  their  magnetogyric 
ratio. A program in magnetic resonance (ESR and NMR) is elucidating details of
electronic structure in many of the materials used in other investigations.

Optical  investigations  include  a broad  investigation  of  the  Faraday  effect  in 
ferromagnetic  metals  that  aims  at  clarifying  details  of  their  band  structure,  especially 
as  a function  of  temperature.  At  very  short  wavelengths,  in  the  x-ray  region,  studies 
are  pursued  on  quantitative  aspects  of  the  resonant  Borrmann  transmission  in  perfect 
single  crystals  under  conditions  where  three  or  more  distinct  beams  are  in  strong  inter- 
action. Here  an  analysis  of  loss  mechanisms  has  been  critical,  and  we  have  initiated 
work  on  better  determination  of  x-ray  dispersion  coefficients,  and  on  the  energy  distri- 
bution of  the  propagating  waves  relative  to  the  atomic  sites. 

X-ray  investigations  are  concerned  with  crystal  structures,  perfection,  thermal 
effects and charge distributions in a wide range of materials. Determinations of lattice
constants through multiple diffraction measurements are yielding data of considerably
enhanced  accuracy  and  precision.  Multiple  diffraction  effects  are  also  being  applied  in 
a careful  interpretation  of  x-ray  data,  including  linear  absorption  coefficients,  line 
shapes  and  line  shifts. 

Theoretical studies include a fundamental program in electron-photon
interactions in magnetic fields, and the theory of slow-electron scattering in solid surfaces.
In addition, there is work on elastic and inelastic interactions between electromagnetic
radiation  and  atoms  and  solids  involving  excitation  of  inner  electrons,  especially  as  they 
apply  in  the  x-ray  region  and  on  the  phase-sensitive  coupling  of  optical  fields  in  solids 
to  suppress  noise.  Finally,  the  nonlinear  coupling  between  x-rays  and  other  waves  is 
studied  in  the  domain  of  dynamic  x-ray  modes. 

Another,  and  new,  aspect  of  solid  state  studies  involves  the  theoretical  and 
experimental  investigation  of  distributed  parameter  networks  on  integrated  electronic 
circuits.  There  has  always  been  a search  for  ways  to  utilize  an  existing  technology  to 
its  fullest  advantage.  By  using  a distributed  parameter  viewpoint  on  silicon  planar 
technology,  it  is  possible  to  synthesize  functions  on  chips  which  are  much  smaller  and 
less expensive than chips containing lumped elements only. The results of initial tests
have  shown  the  feasibility  of  manufacturing  totally-integrated  voltage-tunable  RF  ampli- 
fiers, IF  strips,  FM  demodulators,  oscillators,  and  other  frequency-selective  electronic 
networks.  The  elimination  of  large  lumped  inductors  itself  results  in  an  appreciable 
saving  in  size,  weight,  material,  and  ultimately,  cost.  The  design  is  simple  and  well 
suited  to  mass  production  techniques  in  silicon  technology. 


F.  WAVE-MATTER  INTERACTIONS 
Program  Director:  N.  Marcuvitz 

Several  different  programs  comprise  the  area  of  wave-matter  interactions.  The 
most  comprehensive  and  challenging  of  these  programs  is  a relatively  new  one  which  in- 
volves the  interaction  between  very  high  power  electromagnetic  waves  and  materials,  in 
which  the  interactions  are  highly  nonlinear  and  turbulent.  Other  programs  have  been 
underway  which  treat  interactions  appropriate  to  more  moderate  energy  levels;  such 
interactions  involve  quasi-linear  processes  which  can  be  treated  either  by  equations 
possessing  weak  nonlinearity  or  by  linearized  equations,  the  latter  leading  to  a large 
variety  of  parametric  processes  which  have  been  studied  by  us  in  detail  over  a period  of 
time. The above-mentioned studies are primarily theoretical but they have experimental
phases  as  well.  Other  programs  include  various  experimental  and  theoretical  investiga- 
tions of  plasma  wave  properties  and  plasma  turbulence  effects,  and  studies  in  space 
radiophysics,  which  include  terrestrial  and  extraterrestrial  phenomena. 

A comprehensive  program  has  been  initiated  to  study  phenomena  associated 
with  the  interaction  of  high  power  density  electromagnetic  waves  and  materials.  Its 
motivation  is  to  extend  known  techniques  on  linear  wave  propagation  in  spatially  inhomo- 
geneous and  time  varying  deterministic  media  to  practical  structures  wherein  the  mate- 
rial and  waves  are  nonlinear,  inhomogeneous,  time  varying,  and  turbulent.  Of  particular 
interest  are  self-consistent  studies  of  reflection,  transmission,  and  mode  conversion 
properties  of  high  power  waves  as  a function  of  frequency,  pulse  width  (spatial  and  tem- 
poral), etc. , for  the  range  of  power  densities  wherein  material  properties  become 
markedly  nonlinear  and  experience  phase  changes.  Preliminary  model  computer  studies 
are  being  carried  out  with  the  aid  of  an  interactive  graphic  interpretive  (IGI)  language 
developed for a PDP-11 computer facility; related experimental and theoretical
investigations are also under study.

Another  program,  which  is  also  relatively  recent,  involves  weakly-nonlinear 
interactive  systems;  the  approach  adopted  employs  a scattering  formulation  applied  to 
nonlinear transmission lines. A recent result, involving the general problem of reducing
a nonlinear equation to a linear one, has been the construction, in a simple and
straightforward manner, of a general transformation of this kind for a class of generic
equations, of which the well-known Burgers equation and the Korteweg-de Vries (KdV)
equation are special cases. Another
important  result  has  been  to  show  on  a rigorous  basis  that  the  energy  in  a lossless 
linear  transmission  line  is  conserved,  in  contradiction  to  the  published  literature  which 
characterizes  the  presumed  violation  of  the  conservation  law  as  a paradox. 
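
The general transformation constructed in this program is not reproduced here. As a familiar illustration of what reducing a nonlinear equation to a linear one means, the sketch below uses the classical Cole-Hopf substitution, which is an assumption of this example rather than the method of the report. If φ satisfies the linear diffusion equation φ_t = ν φ_xx, then u = -2ν φ_x / φ satisfies the Burgers equation u_t + u u_x = ν u_xx. The particular solution φ and the value of ν are chosen arbitrarily, and the check is done by finite differences.

```python
import math

NU = 0.1  # assumed diffusion coefficient


def phi(x, t, a=0.5, k=1.0):
    """An exact solution of the linear equation phi_t = NU * phi_xx."""
    return 1.0 + a * math.exp(-NU * k * k * t) * math.cos(k * x)


def phi_x(x, t, a=0.5, k=1.0):
    """Spatial derivative of phi."""
    return -a * k * math.exp(-NU * k * k * t) * math.sin(k * x)


def u(x, t):
    """Cole-Hopf substitution u = -2 * NU * phi_x / phi."""
    return -2.0 * NU * phi_x(x, t) / phi(x, t)


def burgers_residual(x, t, h=1e-3):
    """Finite-difference estimate of u_t + u * u_x - NU * u_xx."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2.0 * h)
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / (h * h)
    return u_t + u(x, t) * u_x - NU * u_xx


for point in [(0.7, 0.3), (2.0, 1.5), (-1.2, 0.0)]:
    print(point, burgers_residual(*point))
```

The residuals vanish to within the finite-difference error, confirming that a solution of the linear problem has been mapped into a solution of the nonlinear one.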

The  program  on  parametric  processes,  which  involve  the  linearization  of 
weakly-nonlinear  interactions,  has  been  highly  successful  over  an  extended  period  and 
now  includes  application  to  nonlinear  optics  and  to  processes  involving  plasmas.  Re- 
cent theoretical  studies  have  stressed  the  interaction  of  light  with  plasmas,  including 
the  study  of  stimulated  Brillouin  scattering  (SBS)  and  parametric-decay  instabilities. 
Each  of  these  nonlinear-optical  interactions  involves  the  ion-acoustic  wave  in  the  plas- 
ma. The  dependences  of  the  instabilities  on  the  dispersion,  absorption,  anisotropy  and 
inhomogeneity  of  the  medium  are  of  particular  concern.  The  results  of  this  investiga- 
tion have  practical  applications  in  laser-produced  plasmas  for  fusion  and  ionospheric 
probing  by  radio  waves.  Other  applications  of  parametric  interactions  are  found  in  the 
development  of  new  coherent  optical  sources  for  communications,  and  in  design  of  pulse- 
echo  devices  for  data  processing.  Theoretical  studies  have  employed  a Floquet  expan- 
sion procedure  which  takes  all  the  space-time  harmonics  into  account,  and  is  rigorous 
within the small-signal regime. The rigorous boundary-value problem and the initial
value  problem  for  instability  evolution  for  an  intense  light  beam  incident  on  a medium 
are  being  solved  and  applied  to  a variety  of  physical  cases.  Instability  thresholds  in 
unbounded  regions  have  been  further  investigated  for  interesting  cases  where  two  inter- 
actions (involving  three  frequencies)  merge  into  one  interaction  (involving  four  frequen- 
cies) upon  application  of  a small  change  in  model  parameters.  Such  merger  of  interac- 
tions appears  likely  in  certain  parametric  interactions  taking  place  in  gradient  inhomo- 
geneities. The  study  of  a class  of  problems  dealing  with  intracavity  parametric  inter- 
actions has  been  initiated,  prompted  by  interesting  experimental  results  obtained  in 
mode locking a ruby laser using stimulated Brillouin scattering in a birefringent crystal.
In  other  experiments,  the  dynamic  build-ups  of  SRS  and  SBS  phonons  in  crystals  are 
being  studied  on  a time  and  space  resolved  basis  in  order  to  illustrate  some  of  the  theo- 
retical predictions  mentioned  above. 

The  research  activities  related  to  plasmas  include  the  examination  of  plasma 
waves  and  the  study  of  plasma  turbulence.  Both  theoretical  investigations  and  laboratory 
experiments  are  being  conducted  in  these  areas.  Both  linear  properties  and  nonlinear 
effects  of  plasma  waves  have  been  and  are  being  examined,  and  related  experiments  are 
being  performed  to  verify  the  theoretical  findings.  Among  the  topics  which  have  been 
treated are: a) the propagation characteristics of electrostatic plasma waves in a
magneto-plasma; b) plasma heating using the resonance damping of cyclotron waves;
c) evolution of parametrically-excited ion-acoustic instabilities and plasma heating;
d) propagation of kilovolt-subnanosecond base-band pulses along a plasma column;
e) mixing of electromagnetic waves in a plasma through modulation of the electron
temperature; f) coupling effects between electromagnetic modes and electrostatic modes;
g) examination of the wave properties related to microwave interferometry in finite-size
plasmas; and h) resonance excitation of plasma waves by a slotted cavity. The effort in
this  area  is  being  continued  and  coordinated  with  the  study  of  plasma  turbulence. 

The  study  of  plasma  turbulence  has  been  gaining  more  emphasis  recently.  A 
comprehensive  theoretical  restructuring  of  plasma  turbulence  has  been  carried  out,  and 
new  experimental  facilities  were  designed  and  built  to  perform  sophisticated  experiments 
on  plasma  turbulence.  One  of  the  new  experimental  facilities  is  a long,  hollow-cathode 
arc  system,  which  can  produce  both  quiescent  and  weakly-turbulent  highly-ionized  plas- 
mas. Not  only  were  the  static  characteristics  of  weak  plasma  turbulence  studied,  but 
the  dynamic  effects  of  the  transition  from  the  quiescent  state  to  the  weakly-turbulent 
states  were  also  observed.  Advanced  data  processing  techniques  were  developed  to  gain 
reliable  data  on  plasma  fluctuations.  One  of  the  distinguished  achievements  in  this  area 
is the clear quantitative evidence of the plasma turbulence generated by the electrostatic
ion-cyclotron  waves  and  of  the  turbulent  diffusion  process.  Dynamic  or  feedback  stabi- 
lization techniques  are  being  applied  to  isolate  the  various  terms  contributing  to  the 
growth  rate  of  the  instability  and  the  transition  into  the  weak  turbulence  regime. 

Associated with the investigations in plasma physics is the research in space
radiophysics  of  natural  terrestrial  and  extraterrestrial  phenomena.  Emphasis  is  on  the 
physics  of  the  upper  atmospheres  and  ionospheres  of  the  planets  and  their  interaction 
with  the  sun's  emanations.  The  program  involves  theoretical  and  experimental  studies 
of  disturbances  in  the  ionosphere  as  produced  by  man-made  events  and  by  natural  phe- 
nomena, such  as  thunderstorms.  Our  facilities  permit  the  sounding  of  the  ionosphere 
for  disturbance  detection.  Radio  emissions  at  VLF  as  detected  by  satellites  have  also 
been  studied.  The  problem  of  the  coupling  between  waves  in  the  neutral  atmosphere  and 
waves  in  the  ionosphere  is  presently  being  studied.  Resonances  have  been  discovered 
and  it  is  planned  to  seek  confirmation  both  from  spacecraft  data  and  from  ground-based 
measurements.  Dynamic  interaction  of  the  atmosphere  and  ionosphere  of  planets  due  to 
heating  by  solar  radiation  has  also  been  studied  from  the  standpoint  of  the  outflow  of 
gases  and  ionization  as  related  to  the  formation  of  the  solar  system.  These  methods 
have been applied to the Saturnian satellite Titan and to Jovian satellites, which are
unique  in  the  Solar  System. 


G.  ELECTRIC  POWER  ENGINEERING 
Program  Director:  E.  Levi 

The  recent  energy  problems  and  the  increased  awareness  of  the  deterioration 
of  the  environment  are  likely  to  bring  about  a significant  increase  in  the  share  of  energy 
utilized in electric form. In response to these needs, the Polytechnic has
initiated  a new  research  program,  and  has  enriched  its  academic  curriculum  by  insti- 
tuting electric  power  engineering  options  at  both  the  undergraduate  and  graduate  levels. 

Under  investigation  are  many  critical  areas  in  the  generation,  transmission, 
and  distribution  of  electrical  power.  Of  particular  interest  are:  (1)  the  magnetic  separa- 
tion of  radioactive  isotopes  in  nuclear  fuels  and  waste  products;  (2)  the  application  of 
novel  technologies  to  the  development  of  individual  components,  such  as  generators, 
high  voltage  transmission  lines,  gas  insulated  substations,  circuit  breakers,  fault- 
current  limiters,  rectifiers  and  inverters;  (3)  the  stability,  reliability,  and  economy  of 
large  integrated  systems;  and  (4)  their  effect  on  the  environment  and  the  quality  of  life. 

At  the  utilization  end,  transportation  has  been  singled  out  as  the  primary  area 
of concern, since it consumes about one quarter of the overall U.S. energy budget and
one half of the oil, and since it creates serious pollution problems. In the U.S., only
one  percent  of  the  railroad  is  electrified,  as  compared  with  more  than  90  percent  in  the 
other  industrialized  countries  of  the  world.  Besides  seeking  improvement  in  passenger 
and  freight  train  services,  new  modes  for  individual,  as  well  as  mass  transportation  are 
being  developed.  These  include  electrified  urban  thoroughfares,  as  well  as  highways. 

A significant  effort  is  devoted  to  linear  propulsion  by  means  of  iron-cored  synchronous 
motors.  These  motors  present  challenging  problems  in  electromagnetics  because  of 
their  complex  geometries;  a recent  study  involves  a new  method  for  the  determination  of 
magnetic  fields  in  air  gaps.  Another  problem  which  arises  in  such  linear  motors  is  the 
need  to  achieve  a more  accurate  assessment  of  eddy  current  losses  and  field  penetration 
in  thickly-laminated  iron  structures.  In  addition,  as  a result  of  pioneering  work  con- 
ducted at  the  Polytechnic,  novel  topologies  for  flux  inter-linkages  have  been  conceived. 

II. SYSTEMS

A.  COMMUNICATIONS 
Program Director: L. Kurz

The  research  program  in  communications  and  information  processing  currently 
covers  topics  in  the  optimization  and  evaluation  of  digital  data  transmission  systems, 
robust  detection  and  estimation  techniques,  and  FM  receivers. 

The  transmission  of  digital  data  in  the  presence  of  intersymbol  interference  and 
noise is a classical problem in communications. The problem has recently taken on
even greater significance as data rates become increasingly higher and, therefore,
the effects of intersymbol interference and noise become more severe. The main research
effort  in  this  area  was  concentrated  on  the  optimum  and  suboptimum  signalling  and  detec- 
tion in  the  presence  of  intersymbol  interference  and  gaussian  noise,  detection  in  the 
presence  of  noise  represented  by  a mixture  distribution  model,  and  techniques  useful  in 
the  monitoring  of  channel  conditions.  In  particular,  a new  design  procedure  for  recur- 
sive minimum  mean-square  error  equalizers  was  developed,  two  types  of  detectors 
using  digital  filters  were  investigated,  suboptimum  detectors  for  signals  corrupted  by 
gaussian  and  impulsive  noise  were  analyzed,  and  two  classes  of  direct  and  indirect 
robust  estimators  of  channel  conditions  were  introduced  and  compared  to  existing  pro- 
cedures. Work  is  continuing  on  all  the  above-described  areas  of  digital  data  system 
optimization. 

Considerable  interest  generated  in  detection,  feature  extraction  and  estimation 
problems,  where  little  is  known  about  the  data  to  be  processed,  motivated  the  develop- 
ment in  recent  years  of  several  classes  of  rank  and  non-rank  robust  (insensitive  to 
underlying distributions) detection and estimation procedures. Though the rank detectors
tend to be more powerful, their non-rank competitors are more robust and simpler to
implement. Thus, the main research effort was concentrated on developing further the
latter  class  of  detectors  stressing  applications  in  the  detection  of  non-constant  signals 
in  mixture  distributions  and  fading  environments,  sequential  and  nonsequential  detection 
of  underwater  sounding  and  two-dimensional  data.  A parallel  effort  included  the  develop- 
ment of  robustized  estimation  techniques  which  are  applicable  in  adaptive  modes  of 
operation in data processing. In particular, a new family of variable-threshold non-rank
procedures was introduced and compared to competitors based on the rank procedures;
slippage and analysis of variance techniques were modified to operate in an adaptive
mode; a new class of quadratic three-sample partition detectors, which is useful in
change-of-scale and stochastic ordering detection problems, was generated and shown to
be an effective and easy-to-implement class of competitors to the rank detectors; various
techniques were developed for finding optimum scores and thresholds for partition
detectors; and two basic theorems pertaining to robustized recursive estimation were proven.

Work  in  all  the  areas  of  robust  detection  and  estimation  described  above  will  be  continued. 

Another  study  is  concerned  with  investigating  a novel  FM  demodulator  capable 
of  combatting  the  effects  of  interfering  signals.  This  demodulator,  consisting  of  two 
phase-locked  loops  interconnected  in  a feedback  arrangement,  has  been  constructed  and 
is  being  studied.  Further  theoretical  and  experimental  analysis  will  continue. 

B. COMPUTERS AND COMPUTER-COMMUNICATION NETWORKS
Program  Director:  E.  J.  Smith 

The research program in Computers and Computer-Communication Networks
covers  a number  of  topics  directed  toward  the  development  of  improved  computer 
architecture,  improved  languages  or  information  structures,  techniques  for  imple- 
menting algorithms,  data  networks,  and  message  switching  systems. 

In  computer  communications,  we  are  concerned  primarily  with  techniques  for 
the  better  understanding  of  and  improved  design  of  data  and  computer  networks.  The 
stress is on message store-and-forward networks with minicomputers used to combine
or concentrate incoming messages, to buffer or store them if need be, and to route messages
either dynamically or according to prescribed routing algorithms. There are interesting
questions  here  of  the  appropriate  modeling  of  interconnected  networks  of  computers  to 
carry  out  the  necessary  analytical  work  and  of  the  appropriate  choice  of  message  statis- 
tics to  be  used  in  the  modeling.  The  combining  or  concentration  function  is  being  studied, 
with  comparisons  made  between  different  combining  techniques.  Dynamic  buffer  schemes 
are  being  studied,  with  optimum  buffer  size  and  comparison  of  various  schemes  a speci- 
fied goal.  From  the  overall  network  viewpoint  a comparison  of  various  adaptive  routing 
algorithms  is  underway,  as  well  as  methods  for  maintaining  flow  control  throughout  the 
network.  Also  being  studied  are  algorithms  for  network  design  taking  into  consideration 
topology,  capacities,  routing,  reliability,  and  other  factors. 

Three  specific  projects  have  been  completed  which  deal  with  adaptive  routing 
and  resource  allocation,  multiple  routing  networks,  and  resource-sharing  in  computer- 
communication  nodes.  The  present  effort  focuses  in  particular  upon  teleprocessing  net- 
work design  algorithms,  channel  assignment  in  mobile  telecommunication  systems,  and 
adaptive  message-switching  networks. 

Investigations  of  the  dynamic  behavior  of  computer-driven  communication  sys- 
tems are  concerned  with  the  development  of  models  which  might  be  useful  as  analytical 
tools  in  the  evaluation  of  such  systems  as  well  as  in  the  ultimate  design  of  improved 
systems.  In  order  to  provide  a focus  for  the  effort,  a particular  message-switching 
system  is  studied  in  detail  and  the  approach  is  directed  toward  the  development  of  an 
interconnectivity  model  in  which  the  system  is  viewed  as  a collection  of  program  modules, 
each  of  which  communicates  with  other  modules  through  an  interconnection  medium  via 
a particular  machine.  This  often  obscures  the  original  intent  of  the  designer  of  the  sys- 
tem and  provides  no  insights  into  the  necessary  data  structures  and  operations  for  per- 
forming message-switching  functions.  Motivated  by  these  considerations,  current  work 
is  directed  toward  the  development  of  a higher-level  algorithmic  description  language  for 
use  in  the  design  and  specification  of  switched  communication  systems.  Desirable  char- 
acteristics of  such  a language  are  defined  and,  as  an  initial  step,  the  ideas  are  applied 
to  the  description  of  a communications  scanner  channel.  The  modules  are  characterized 
by  various  parameters  such  as  core  size,  execution  time,  data  base,  communication 
linkages to other modules, etc.; and the application of the model is viewed with respect
to  the  evaluation  of  system  changes  and  debugging  difficulty,  throughput,  and  local  opti- 
mization of  individual  models.  A first  version  of  a communications-oriented  design 
language  has  been  completed  and  a current  effort  is  concerned  with  the  investigation  of 
appropriate  data  structural  forms  for  efficient  implementation  of  the  language.  Also 
recently  completed  was  an  approximate  analytic  model  of  a message-switching  system 
which  provides  the  capability  of  predicting  message-processing  delays  from  a knowledge 
of  the  message  switch  architecture  and  the  traffic  statistics,  as  well  as  permitting  one  to 
estimate  the  effect  on  performance  of  transferring  a task  from  one  processor  to  the  other 
in  a dual-processor  system.  The  current  effort  attempts  to  extend  the  model  to  a multi- 
processor system  consisting  of  microprocessor  modules. 

In the area of switching theory, a dynamic fault-test generation scheme has
been  developed  for  combinatorial  logic  circuits,  and  an  extension  of  the  same  technique 
is  being  applied  to  the  case  of  an  asynchronous  sequential  machine  in  which  tests  are 
generated  from  the  acyclic  circuit  after  a near-minimum  number  of  feedback  cuts  have 
been  made  in  the  original  circuit.  Dynamic  tests  are  sought  rather  than  static  tests 
since  the  former  do  not  require  checking  and  consequently  require  less  complex  compu- 
tation. Techniques  are  also  being  investigated  for  the  determination  of  partitions  having 
the  substitution  property,  or  other  properties  of  special  interest,  through  use  of  the 
predicate calculus; efficient algorithms for computation are sought.

In the area of machine architecture, several studies are in progress, including
the  exploration  of  a simple,  low-cost,  bus-oriented  computer  having  a common  memory 
and  instruction  format  for  microprograms  and  programs  as  an  array  processor,  and  the 
investigation  of  a variable-precision  arithmetic  technique  in  which  all  numbers  are 
stored  internally  within  the  computer  as  quotients  of  integers.  The  resulting  effect  upon 
computational  accuracy  and  machine  time  will  be  studied;  other  related  schemes  will  be 
sought  out  and  explored. 
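
The trade-off between accuracy and machine time in quotient-of-integers arithmetic can be illustrated with a minimal sketch. It uses Python's standard fractions.Fraction type as a stand-in for the proposed internal representation; the function name harmonic_sum and the sample computation are illustrative assumptions, not part of the study.

```python
from fractions import Fraction

def harmonic_sum(n, number=Fraction):
    """Sum 1/1 + 1/2 + ... + 1/n using the given number type."""
    total = number(0)
    for k in range(1, n + 1):
        total += number(1) / number(k)
    return total

exact = harmonic_sum(10)            # held as a quotient of two integers
approx = harmonic_sum(10, float)    # conventional floating point

print(exact)                            # exact numerator/denominator
print(float(exact) - approx)            # rounding error of the float sum
print(exact.denominator.bit_length())   # storage needed by the denominator
```

The rational result carries no rounding error, but the integers grow as terms accumulate, which is one way the cost in machine time and storage could appear.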

C.  SAFETY,  RELIABILITY  AND  SOFTWARE  ENGINEERING 
Program  Director:  M.  L.  Shooman 

The  areas  of  Safety,  Reliability  and  Software  Engineering  encompass  a broad 
spectrum  of  analytical  modeling,  experimental  systems  research,  and  systems  engineer- 
ing. The  underlying  thread  of  cohesion  in  these  areas  is  the  application  of  modern  tech- 
niques of  probabilistic  modeling,  statistical  measurement  and  experimentation,  and  the 
development  of  design  and  optimization  techniques  within  the  engineering  process.  The 
emphasis  is  at  times  on  the  basic  development  of  the  tools  and  methodology  for  such 
studies  and  often  on  the  application  of  these  new  as  well  as  existing  techniques  to  the 
advancement  of  the  state  of  knowledge  in  the  given  area. 

This  section  describes  and  reports  the  work  of  several  diverse  groups  within 
the  Institute,  some  who  are  developing  technology  in  areas  outside  of  electronics,  but 
with  direct  application  to  electronic  devices  and  systems,  and  others  who  are  working 
within  the  mainstream  of  electronics. 

To  the  probabilistic  modeler,  in  the  abstract,  the  concepts  of  safety,  reliability 
and availability have a unifying thread. The probability of no equipment, human, or
procedural failure which endangers a human being is expressed in the safety index.
Similarly, if the failure described in the preceding sentence affects system operation so as to
cause a failure, we are discussing reliability. Clearly, safety and reliability analysis
have  the  same  methodology;  however,  the  definitions,  implications,  and  design  goals 
differ  significantly.  The  availability  index  is  a probability  which  measures  the  percent- 
age of  systems  which  are  up  at  any  specified  time  and  allows  one  to  model  and  measure 
the  effectiveness  of  failure-repair  dynamics.  Maintainability  is  measured  by  repair 
rate,  and  repair  and  failure  rate  combine  through  the  system  configuration  to  determine 
availability.  Lastly,  when  we  turn  our  attention  to  software  we  find  that  the  concepts  of 
system  reliability,  availability,  and  maintainability  apply  well;  however,  it  is  important 
to  emphasize  the  differences  between  hardware  and  software  in  constructing  our  models. 
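
As a minimal illustration of how repair rate and failure rate combine through the system configuration, the sketch below uses the standard steady-state result for a single repairable unit with constant failure rate and constant repair rate, A = repair_rate / (failure_rate + repair_rate). It assumes the units fail and are repaired independently when they are combined in series or in parallel. The function names and the numerical rates are hypothetical.

```python
def unit_availability(failure_rate, repair_rate):
    """Steady-state availability of one repairable unit (two-state Markov model)."""
    return repair_rate / (failure_rate + repair_rate)

def series_availability(availabilities):
    """System is up only if every unit is up; units assumed independent."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result

def parallel_availability(availabilities):
    """System is up if at least one unit is up; units assumed independent."""
    all_down = 1.0
    for a in availabilities:
        all_down *= (1.0 - a)
    return 1.0 - all_down

# Hypothetical rates, in events per hour.
a = unit_availability(failure_rate=0.01, repair_rate=0.99)
print(a, series_availability([a, a]), parallel_availability([a, a]))
```

More complex repair models, such as a shared repairman, break the independence assumption and call for a full Markov or queueing treatment.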

Activities  in  the  safety  area  have  recently  been  stimulated  and  accelerated  by 
the  arrival  of  new  funding  from  the  Department  of  Transportation  for  the  second  year  of 
a proposed  three-year  intermodal  study  of  transportation  system  safety.  This  work  is  an 
interdisciplinary  effort  with  several  participating  departments.  Clearly,  although  much 
of the transportation vehicle and its guideway involves mechanical systems, a very
high  and  increasing  percentage  of  modern  ships,  aircraft,  rail,  and  automobile  systems 
involve  electronic  control,  communications,  sensing,  guidance,  and  so  on.  One  of  the 
chief  methodological  tools  being  applied  to  this  area  is  the  system  fault  tree.  The  most 
important theoretical aspects are its construction, computerization, and collection of
probabilistic input data. In this latter area some new techniques of statistical estimation
called consensus estimation are being developed. These differ from, but bear
resemblance to, the Delphi techniques used in forecasting. Also of very great importance
is the modeling of the human operator. In a transportation system it is impossible
to  divorce  the  human  operator  from  the  control  system.  Although  a great  deal  has  been 
done  to  model  the  human  transfer  function,  researchers  have  had  less  success  in  mod- 
eling human  errors  which  vitally  affect  both  reliability  and  safety.  We  have  enlisted  the 
aid  of  an  experimental  psychologist  to  work  with  the  systems  engineers  in  this  area  of 
research  and  hope  that  the  cross  fertilization  of  ideas  will  lead  to  new  approaches. 
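
The role of probabilistic input data in a computerized fault tree can be sketched as follows. This is a minimal example under stated assumptions: basic events are statistically independent, no basic event appears in more than one branch, and only AND and OR gates are used. The event names and probabilities are hypothetical and are not taken from the transportation study.

```python
def gate_probability(node, basic):
    """Probability of a fault-tree event, assuming independent basic events.

    node is either a basic-event name (str) or a tuple (gate, children),
    where gate is "AND" or "OR" and children is a list of nodes.
    """
    if isinstance(node, str):
        return basic[node]
    gate, children = node
    probs = [gate_probability(child, basic) for child in children]
    if gate == "AND":
        p = 1.0
        for q in probs:
            p *= q
        return p
    if gate == "OR":
        none_occur = 1.0
        for q in probs:
            none_occur *= (1.0 - q)
        return 1.0 - none_occur
    raise ValueError("unknown gate: " + gate)

# Hypothetical basic-event probabilities, including a human-error term.
basic = {"brake_fault": 0.01, "signal_fault": 0.02, "operator_error": 0.05}
tree = ("OR", [("AND", ["signal_fault", "operator_error"]), "brake_fault"])
top = gate_probability(tree, basic)
print(top)
```

The top-event probability is only as good as the basic-event estimates, which is why the collection of input data and the modeling of human error matter so much.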

The  reliability  (including  availability  and  maintainability)  area  has  been  a focus 
of  activity  for  many  years.  Many  topological  methods  of  system  analysis  have  been 
developed  along  with  approximate  bounds  which  have  served  as  the  basic  algorithms  for 
many  of  the  computer  analysis  programs  produced  in  recent  years.  Also,  much  has 
been  done  in  applying  Markov  models  to  a wide  variety  of  problems  in  availability  com- 
putation. Recent  thrusts  involve  (1)  a queueing  theory  approach  to  availability  and  main- 
tainability which  allows  more  complex  repair  models;  (2)  modeling  of  the  inspection 
interval  problem  (applied  to  critical  systems),  so  as  to  optimize  the  benefits  in  safety 
and  minimize  the  cost;  and  (3)  the  study  of  computer-assisted  test  of  electronic  systems 
with  regard  to  design  constraints,  optimum  test  points,  and  overall  cost  minimization 
and  reliability  maximization. 


In  the  area  of  software  engineering,  a substantial  portion  of  this  effort  is  sup- 
ported by  the  Rome  Air  Development  Center.  Activities  in  this  area  are  focused  upon 
five  aspects  of  software  reliability:  a)  construction  of  probabilistic  models  for  software 
errors,  which  reflect  the  content  and  type  of  error,  removal  and  generation  rate,  pre- 
diction of  mean  time  between  failure  and  reliability;  b)  measures  of  computational  com- 
plexity based  upon  the  algorithm,  automata  theoretic  or  graph  theoretic  complexity, 
information content and program size; c) relationship of modular and structured
programming styles to reliability; d) modeling techniques for program verification and
testing  including  optimal  test  strategies  for  nonexhaustive  tests;  e)  models  for  the  com- 
parison of  programming  languages,  comparative  study  of  formal  definition  languages, 
and  techniques  for  compiler  writing. 

The  probabilistic  modeling  effort  dealt  with  the  construction  of  many-state 
Markov  models  for  the  determination  of  software  availability  and  macro  models  to 
predict  number  of  bugs  and  program  reliability.  Preliminary  models  were  developed 
to  predict  operational  software  reliability.  In  the  second  phase  of  this  work,  error- 
generation  terms  were  added  to  increase  the  degree  of  realism.  A new  micro  approach 
relates  errors  and  reliability  to  a functional  path  decomposition  of  the  program. 


Several  approaches  are  being  taken  to  provide  measures  of  complexity,  both 
for  the  problem  and  the  ensuing  software.  The  initial  efforts  treated  the  complexity 
of  a function  via  recursive  function  theory.  Present  efforts  are  directed  toward 
relating  program  complexity  to  established  ideas  in  the  fields  of  automata,  communica- 
tions and  information  theory.  Recent  results  provide  a measure  of  program  length 
based on Zipf's law and operator/operand count.
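
The report does not state the form of this length measure. One way such an estimate can arise is sketched below under two assumptions of our own: the distinct operator and operand types, ranked by frequency, follow Zipf's law (frequency proportional to 1/rank), and the rarest type occurs exactly once. The total token count for t distinct types is then t(1 + 1/2 + ... + 1/t), which is approximately t(ln t + 0.5772). The function names and the sample values of t are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def zipf_length_exact(t):
    """Total token count if the r-th most frequent type occurs t/r times."""
    return t * sum(1.0 / r for r in range(1, t + 1))

def zipf_length_approx(t):
    """Closed-form approximation t * (ln t + gamma) of the sum above."""
    return t * (math.log(t) + EULER_GAMMA)

for t in (10, 100, 1000):   # hypothetical numbers of distinct types
    print(t, round(zipf_length_exact(t)), round(zipf_length_approx(t)))
```

Under these assumptions, a count of distinct operators and operands made early in development gives a rough forecast of the finished program's length.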

In the area of structured and modular programming, two efforts are being pursued.
One effort, presently in progress, applies modular programming techniques to
the  automatic-interactive  construction  of  programs.  In  addition,  the  statistical  design  of 
an  experiment  for  the  objective  evaluation  of  structured  vs.  nonstructured  programs  is 
in  the  planning  phase. 

In  the  program  test  area,  two  efforts  are  in  progress.  The  objective  of  the 
first  effort  is  to  develop  a method  which  computes  the  number  of  tests  necessary  to 
verify a computer program. Verification is categorized into several classes, including
"exhaustive" at one extreme and "processing of each instruction at least once" at the
other.  In  the  second  effort,  an  analytic  method  is  being  developed  for  the  selection  of 
data  for  automatic  program  testing. 

In  the  area  of  languages,  a new  effort  has  begun  to  construct  a very  high  level 
language  for  writing  program  specifications.  It  is  envisioned  that  this  language  will  lie 
between a high level programming language (FORTRAN, PL/I, etc.) and English prose,
but  will  be  concise  and  yet  avoid  ambiguity. 

D.  SYSTEMS,  CONTROL,  AND  NETWORKS 
Program  Director:  D.  C.  Youla 

Research  activity  in  the  network  area  encompasses  both  the  classical  and 
modern  lumped-distributed  domains.  In  the  latter,  several  significant  breakthroughs 
have  been  made  which  are  expected  to  lead  to  an  exact  insertion-loss  synthesis  technique 
for  the  design  of  optimum  filters  incorporating  cascades  of  equi-delay  TEM  lines  and 
lumped,  lossless  two-ports.  In  particular,  it  is  expected  that  transformers  and  filters 
exhibiting  equi-ripple  performance  in  both  pass  and  stop  bands  can  be  designed  to  speci- 
fications. In  principle,  there  appears  to  be  no  reason  why  the  method  cannot  be  extended 
to  single-moded  waveguide  networks  employing  obstacles  to  produce  the  lumped  discon- 
tinuities. An  unexpected  and  important  outgrowth  of  the  above  study  has  been  the  devel- 
opment of  a new  diagnostic  digital  technique  for  probing  one-dimensional  nonuniform 
structures.  Hopefully,  the  modeling  of  dispersion  by  means  of  lumped  two-ports  will 
enlarge  the  scope  of  the  method  significantly. 

A major  topic  in  the  control  area  involves  the  application  of  classical  ideas  to 
the  design  of  feedback  controllers  for  linear  multivariable  systems.  All  efforts  on  this 
topic  have  been  completely  successful.  A totally  new  frequency-domain  technique  has 
been  developed  for  the  design  of  optimal  multivariable  controllers.  The  class  of  prob- 
lems that  fall  within  the  scope  of  this  technique  is  much  broader  than  that  encompassed 
by the linear-quadratic-Gaussian (LQG) approach. The latter revolves around the idea
of  Kalman  filtering  and  for  this  reason  is  unable  to  absorb  in  a natural  and  straightfor- 
ward manner  many  essential  practical  constraints.  The  group  is  presently  engaged  in 
work  which  should  result  in  an  effective  computer  implementation  of  the  optimal  con- 
troller. The  availability  of  such  an  algorithm  will  undoubtedly  suggest  related  simpler 
suboptimal  strategies  and  open  the  door  to  significant  industrial  applications. 

Work  on  the  control  of  stochastic  systems  has  been  focused  on  situations  in 
which  discrete  events,  such  as  equipment  failures  or  message  arrivals,  occur  at  random 
times.  Recursive  optimization  equations  have  been  derived  in  many  cases  via  dynamic 
programming.  Novel  techniques  have  been  devised  to  solve  those  equations  to  get  opti- 
mal decision  rules  for  diverting  messages  or  vehicles  to  alternate  routes  in  situations 
with  simple  arrival  and  service  time  probability  distributions.  In  more  complex  cases, 
structural  properties  of  the  optimal  controllers  have  been  derived  and  specific  control 
algorithms  have  been  compared  via  simulation. 




In  the  area  of  reliability  applications,  both  repairman  assignment  and  optimal 
inspection  problems  have  been  solved.  The  problems  studied  here  have  been  of  a higher 
order  of  complexity  than  those  generally  formulated  by  system  analysts.  In  one  case, 
both  feedback  control  for  dynamic  tracking  accuracy  and  repairman  assignment  policies 
were simultaneously optimized for repairable systems. In another, techniques were
developed  for  optimizing  inspection  schedules  when  the  corresponding  tests  degrade  the 
lifetimes  of  the  very  components  whose  status  is  being  checked. 

Identification and parameter estimation from random data are of central interest
in  systems  applications.  In  control  applications,  one  must  have  better  knowledge  about 
the  plants  as  well  as  effective  models  of  certain  plant  elements  or  functions  for  which 
there  are  no  intrinsic  analytical  descriptions  available.  This  is  especially  true  when 
human  elements  are  in  the  loop  and  we  try  to  describe  neuromuscular  systems  such  as 
arm  and  leg  control,  pupil  response,  etc.  We  also  require  identification  and  modeling 
techniques  when  the  system  itself  is  too  complex  so  that  its  fundamental  input-output 
properties  are  unknown.  This  is  especially  true  of  large-scale  economic  systems  where 
dynamic  models  are  obtained  in  some  optimal  fashion  from  available  data  in  order  to 
describe  the  evolution  of  the  system  as  a result  of  changing  policies.  Finally,  even  in 
the  case  for  which  dynamic  equations  describing  a system  are  known  or  accepted, 
parameters  are  often  unknown  and  must  be  estimated.  Identification  and  estimation 
techniques  have  been  applied  to  all  of  the  problems  described  above.  But,  to  a great 
extent,  the  basic  feature  of  previous  applications  of  identification  and  parameter  estima- 
tion is  that  the  models  of  the  systems  studied  have  been  time  invariant,  and  their  param- 
eters have  been  estimated  from  statistically  stationary  data  (after  trends  have  been 
removed).  However,  there  exist  important  phenomena  for  which,  even  after  trends  have 
been  removed,  the  data  is  intrinsically  non-stationary,  characterized  usually  by  a period 
of  increasing  and  then  decreasing  intensities.  Such  characteristics  are  found  in  eco- 
nomic systems,  but  are  even  more  prevalent  in  geophysical  observations  such  as  mete- 
orological fluctuations  and  geological  phenomena  such  as  earthquake  excitations.  Moti- 
vated by  this  last  application  to  characterize  the  statistical  properties  of  earthquake 
excitations,  recent  studies  have  led  to  new  limit  theorems  for  estimating  significant 
parameters in models of non-stationary time series, and have led to a specific approach
for  modeling  earthquake  acceleration  data.  This  approach  has  been  implemented  on 
computers  and  is  being  applied  to  study  recent  earthquake  acceleration  data  obtained 
from the Western United States. We are just beginning to develop techniques
for the identification of non-stationary systems and data. Many problems remain to be
solved. 


E.  DATA  PROCESSING 
Program  Director:  A.  Papoulis 

Current  studies  in  data  processing  include  the  following  aspects  of  picture  (or 
tabular)  processing  and  spectral  estimation:  statistical  enhancement  and  extraction 
techniques  for  pictorial  or  tabular  data;  reduction  of  diffraction  effects  in  the  imaging 
of  coherent  and  incoherent  objects;  image  distortion  resulting  from  atmospheric  turbu- 
lence; and  statistical  analysis  in  spectral  estimation. 

Statistical  enhancement  and  extraction  techniques  for  pictorial  or  tabular  data: 
This  research  concentrates  on  developing  procedures  to  present  the  data  in  a useful 
form  or  to  present  the  data  in  some  way  in  which  factors  in  the  data  which  are  "almost 
invisible"  become  plainly  visible.  The  effort  was  concentrated  on  factor  analysis  and 
masking techniques. The masking operation was taken to mean the process in which a
"window" or "mask" is swept across the data either electronically, mathematically,
mechanically, or by some combination of these. The motives for selection of a particular
mask  are  simplicity  and  effectiveness.  The  preliminary  results  indicate  that  both  ap- 
proaches to  the  enhancement  and  extraction  problems  of  tabular  data  are  promising,  and 
the  research  effort  utilizing  these  and  related  techniques  will  be  continued. 

Reduction of diffraction effects in the imaging of coherent and incoherent objects:
The known methods of deconvolution in one and two dimensions for analog and
digital  data  are  compared  for  accuracy  and  computational  economy.  A new  technique 
is  developed  for  the  complete  recovery  of  objects  of  finite  size.  The  technique  is  based 
on  an  iteration  scheme  involving  only  the  FFT.  It  is  shown  that,  in  the  absence  of 
noise,  the  iteration  converges  to  the  unknown  object.  The  effects  of  noise  and  the  ali- 
asing errors  are  determined  and  it  is  shown  that  they  can  be  controlled  by  early  termina- 
tion of  the  iteration. 
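
The report does not spell out the iteration itself. As a minimal sketch, assuming it alternates between re-imposing the measured low-pass spectrum and enforcing the known finite extent of the object (an iteration of the Gerchberg-Papoulis type), the idea can be illustrated as follows. The object, the support, the passband width K and the function names are all hypothetical and chosen only for illustration; no noise is included, and convergence can be slow.

```python
import cmath
import math

def fft(a, inverse=False):
    """Recursive radix-2 FFT (length must be a power of two); unnormalized."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def ifft(a):
    """Inverse FFT with the 1/n normalization."""
    return [v / len(a) for v in fft(a, inverse=True)]

def recover(known_spectrum, passband, support, n_iter):
    """Alternate between the known passband spectrum and the finite-support constraint."""
    n = len(known_spectrum)
    g = [0j] * n
    history = []
    for _ in range(n_iter):
        G = fft(g)
        for k in passband:                      # re-impose the measured spectral samples
            G[k] = known_spectrum[k]
        g = ifft(G)
        g = [g[i] if i in support else 0j for i in range(n)]   # object has finite size
        history.append(g)
    return history

N = 64
support = set(range(28, 36))                    # hypothetical 8-sample object
obj = [0j] * N
for i in support:
    obj[i] = complex(1.0 + 0.5 * math.sin(i))
K = 10
passband = list(range(0, K + 1)) + list(range(N - K, N))   # bins with |k| <= K
spectrum = fft(obj)                             # only the passband bins are used

def error(g):
    """Euclidean distance between an estimate and the true object."""
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(g, obj)))

history = recover(spectrum, passband, support, 200)
err_first, err_last = error(history[0]), error(history[-1])
print(err_first, err_last)
```

Both steps are projections onto sets that contain the true object, so the distance to the object cannot grow from one pass to the next; stopping early trades resolution for robustness, in line with the remark on early termination above.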

Image  distortion  resulting  from  atmospheric  turbulence:  In  the  recent  litera- 
ture, a method  has  been  proposed  for  a dynamic  correction  of  this  distortion.  The 
underlying  filtering  (Poisson  noise)  and  prediction  (control  delay)  problem  leads  to  a 
two-dimensional  Wiener-Hopf  equation  whose  solution  is  under  investigation  under 
various  assumptions  about  the  spectrum  of  the  turbulence.  The  reverse  problem  of 
determining the properties of the medium in terms of the image of a moving satellite has
also  been  considered. 

Statistical analysis in spectral estimation: The current emphasis is on the
statistical analysis of the method of maximum entropy and on its comparison with other
methods for various special forms of the unknown spectra. An adaptive perturbation
scheme is under consideration for updating the estimated spectra as new information is
received. The convergence of the scheme to the smoothed spectrum is under examination.


ELECTROMAGNETICS


MUTUAL  COUPLING  IN  ARRAYS  ON  CONCAVE  SURFACES 
H.  Ahn  and  A.  Hessel 

This  work  is  devoted  to  the  development  of  the  ray  analysis  of  coupling  coeffi- 
cients in  collector  arrays  of  aperture  elements  on  the  interior  concave  surfaces  in 
scannable feed-through lenses of the Dome Antenna type. In such a lens, the energy
is  radiated  by  a planar  feed  array  which  is  located  on  the  diameter  of  a passive  lens 
shaped  as  a dome.  This  energy  is  received  by  the  elements  of  the  collector  array  lo- 
cated on  the  interior  concave  side  of  the  dome  and  is  transmitted  via  a bank  of  passive 
phase  shifters  to  the  radiator  elements  placed  on  the  exterior  convex  side  of  the  dome. 
To  assure  an  efficient  energy  transfer,  both  the  collector  and  the  radiator  elements 
must  be  matched  in  their  respective  array  environments  for  an  appropriate  range  of 
incidence angles and over a desired frequency band. To ascertain the degree of mismatch,
the designer usually avails himself of experimentally determined coupling coefficients,
which, in turn, for a given excitation determine the collector and the radiator array
efficiency (mismatch loss). The virtues of the ray method which is being developed for
arrays on concave surfaces are that, in contrast with the modal techniques, a) it is not
restricted to a separable array surface geometry and b) it permits the exploitation of
the planar phased array computer programs. The ray fields
employed  here  are  not  of  the  usual  type  in  the  presence  of  a perfectly  conducting  sur- 
face, but  rather  represent  locally  plane  waves  radiated  by  the  excited  element  and  re- 
ceived by  a collector  element,  weighted  by  their  patterns  in  a locally  planar,  matched 
array  environment.  These  locally  plane  waves  are  received  either  via  a direct  ray  or 
after  one  or  more  internal  reflections.  The  ray  internal  reflection  coefficients  corres- 
pond to  a plane  wave  impinging,  in  the  ray  direction,  on  a locally  planar  array  with  the 
geometry  appropriate  to  the  point  of  reflection. 

The collector array model is a uniformly spaced circular array of N open-ended
"single mode" parallel-plate waveguides in a concave cylindrical surface (Figure 1). The
waveguides  propagate  only  the  dominant  mode  of  the  appropriate  polarization,  and  are 
equipped  with  a network  that  would  match  a corresponding  planar  array  at  broadside. 
Modal  analysis  in  a unit  cell  yields  the  active  (phase  sequence)  aperture  admittance 
Y^(n)  and  reflection  coefficients  r(n)  in  the  form 

Y (0)  - Y (n) 

r(n)  = ^ , (n  = 0,  ± 1 ...  ±N/2)  . (1) 

y;(0)  + Y^(n) 

P ^ 


For  TM  case, 


Y^(n) 


J (kr  )/j'  (kr  ) 
n ' o"  n o 
m m 


(2) 
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Fig.  1.  Cylindrical  collector  array  model,  ray 
structure  and  matching  network. 

and  for  TE  case. 


2 , J (kr  ) 

cos  n b/2r  n ' o 
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The  coupling  coefficients  are  obtained  from  Eq.  (1)  via 
N/2 

S,  = k E„,  • (-ll'sf ) ■ <’> 

n=  -N/  Z 

Here, $n_m = n + mN$, $Y_p(0) = G_p(0) + jB_p(0)$ and $J_n(kr_0)$ is the Bessel function of the first
kind and nth order. Samples of numerical results based on Eqs. (1) - (4) are shown in
Figure 2. They indicate that for $d/\lambda < 1/2$ the $|S_\ell|$ curve follows initially that of a
planar array, but that gradually the decay rate of $|S_\ell|$ decreases, and the curve $|S_\ell|$ vs. $\ell$
passes through a dip, with a subsequent rise, accompanied by a distinct ripple. These
features are similar for both the E- and H-plane coupling coefficients.


To develop the ray method, Eqs. (1) - (4) for TM are cast into a form that permits
an asymptotic evaluation of $S_\ell$. This is accomplished by imbedding the cylindrical
collector array in an infinite angular space.* As a result, the unit cell Floquet
phasing $\nu$ is no longer limited to integral values as in Eqs. (1) - (4), but becomes
continuous. Equations (1) - (4) transform into


Y (0)  - Y (v) 
r(v)  = r(lvl)  = ^ 


Y^(0)  + Y (v) 

P ® 


(5) 
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Fig.  2.  E-plane  coupling  coefficients  vs.  collector  array  radius. 
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-jv~  (i  + pN) 
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In  Eq.  (7)  the  index  p corresponds  to  the  pth  image  source  in  the  infinite  angular  space. 
Using the Debye approximation for $J_\nu(kr_0)$, the integrand in Eq. (7) is expanded in a
Neumann (multiple scattering) series (actually only a finite number of terms is
permitted). After a stationary phase evaluation, each term yields a ray contribution to $S_\ell$.
The n times reflected ray contribution from the p = 0 source is

^(n)  _ [Tp(ksin9;^)  [f(ksin9;)f  -jkr^2(n+ l)cos  0^ 


Here $T_p$ is the planar, equivalent array transmission coefficient for a scan angle $\theta_n$ off
broadside. $\Gamma$ is the equivalent planar array voltage reflection coefficient for a plane wave
incident at $\theta_n$, the angle $\theta_n$ being that between the ray direction and the local array
normal at the excited element. It is seen that, while the direct ray (n = 0) does not cross
a caustic, the reflected rays do, as indicated by the factor. Equation (8) can be

cast into an invariant form involving the locally planar element patterns, reflection
coefficients and the distance between the excited and the receiving element. Figure 3
shows a comparison between the numerical results for $|S_\ell|$ obtained by modal and ray
analysis. The latter includes the direct, singly and doubly reflected rays. The agreement
is excellent (in phase as well) up to the dip region, where the Airy approximation
of $J_\nu$ must be employed. This is presently under study.


Fig.  3.  Coupling  coefficients  in  a cylindrical  collector  array  - ray  vs.  modal  analysis. 

The ray method clarifies the formation of the $|S_\ell|$ curves. The portion past the
dip is due primarily to the direct ray contribution modulated by the interference between
the direct and singly reflected ray. The portion near the source corresponds to a nearly
planar behaviour and the dip represents a transition region between the quasi-planar
regime near the source and the ray region.


CAVITY RESONANCES AND MUTUAL COUPLING IN CONCAVE WAVEGUIDE ARRAYS
H.  Steyskal 

A systematic  design  of  feed-through  lens  or  dome  antennas^  requires  knowledge 
of  the  mutual  coupling  between  waveguide  elements  on  the  concave  interior  surface  of  a 
lens.  Considering  the  lens  as  a lossless  cavity  (spherical  or  cylindrical)  through  which 
the  waveguides  are  coupled  allows  an  equivalent  multiport  network  description  of  the 
feed-through  system.  A complication  in  this  approach  arises  from  a possibly  large 
number  of  degenerate,  resonant  cavity  modes,  since  the  cavity  is  large  in  terms  of 
wavelength  and  also  is  highly  symmetric.  This  will  be  the  topic  of  the  present  study 
and  it  will  be  shown  that  at  a cavity  resonance  the  mutual  coupling  depends  critically  on 
the  relative  number  of  waveguide  modes  and  resonant  cavity  modes. 

In order to determine the scattering matrix for the network in Fig. 1, we solve
for the transverse electric field $E^a$ over the aperture surface S, which constitutes the
interface between the waveguide apertures and the cavity, in terms of the incident field
$E^i$ at S. This then gives rise to a reflected field $E^r = E^a - E^i$.

Fig. 1. Concave waveguide array: a cavity-coupled multiport network.

Following Galerkin's method we expand the transverse waveguide fields into a set
of orthonormal waveguide modes $e_n$ (n denoting here both the mode and the waveguide)
and the cavity field into radial waves with transverse orthonormal real mode functions
$\hat{e}_k$. The aperture field $E^a$ is approximated by a finite set of modes with unknown
amplitudes $v_n$,

$$E^a = \sum_{n=1}^{N} v_n e_n . \tag{1}$$

Enforcing the continuity of the tangential electric and magnetic fields across S and the
vanishing of $E^a$ over the conducting parts of S (see Ref. 2) leads to an equation for $E^a$.
Finally, the requirement that both sides of this equation have equal projections on the
set spanned by the basis $\{e_n\}_1^N$ leads to the matrix equation

$$2 Y v^i = C v . \tag{2}$$

Here the column vectors $v^i = \{v_n^i\}_1^N$ and $v = \{v_n\}_1^N$ denote the incident and the total
modal voltages at S. The matrices Y and C are

$$Y = [\, y_m \delta_{mn} \,] \tag{3}$$

$$C = \Big[\, y_m \delta_{mn} + \sum_{k=1}^{K} \hat{y}_k \, (\hat{e}_k, e_m)(\hat{e}_k, e_n) \Big] \tag{4}$$

where $y_m$ denotes the waveguide modal admittance, $\hat{y}_k$ the modal input admittance looking
into the cavity from the aperture S, and the inner product is defined by

$$(e_m, e_n) = \int_S e_m \cdot e_n \, dS . \tag{5}$$

The solution of Eq. (2) is usually obtained by inverting the matrix C. However, when a
resonance of the cavity is approached all elements of C become infinite (since some
$\hat{y}_k \to \infty$) and a more careful analysis is required.

Consider a cavity with R degenerate modes so that $\hat{y}_1 = \hat{y}_2 = \cdots = \hat{y}_R$. At
resonance, when $\hat{y}_R \to \infty$, we factor out $\hat{y}_R$ from C to obtain

$$C = \hat{y}_R \, (A + \varepsilon) \tag{6}$$

where the elements of A and $\varepsilon$ are

$$a_{mn} = \sum_{k=1}^{R} (\hat{e}_k, e_m)(\hat{e}_k, e_n) \tag{7}$$

$$\varepsilon_{mn} = \frac{1}{\hat{y}_R} \Big[\, y_m \delta_{mn} + \sum_{k=R+1}^{K} \hat{y}_k \, (\hat{e}_k, e_m)(\hat{e}_k, e_n) \Big] \tag{8}$$

and thus Eq. (2) becomes

$$2 Y v^i = \hat{y}_R \, (A + \varepsilon) \, v . \tag{9}$$


Now,  the  desired  mutual  coupling  coefficient  between  the  ports  p and  q may  be 
determined  by  Cramer's  rule  as 
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S 

pq 


••a,  ,+E,  ,,0,a,  ,,+E,  ,,,  • 

l,p-l  l,p-l  ' l,p+l  l,p+l 

■'®in‘''*in 

^Nl'^’^Nl*  ■ 

■ ' ^N.  D-  1 '''  ‘^N.  D-  1’  ^N.  D+1  '"n.  p+1’  ■ 

' '^nn‘*'‘^nn 



vjj=0,n;(q 

“ll"‘ll' 

'IN  IN 

“N1  Nl' 

’ NN  NN| 

(10) 


In order to evaluate Eq. (10), which usually is indeterminate when $\hat{y}_R \to \infty$, the two
determinants are expanded in power series of $\varepsilon$, which leads to


lie 


N 

z 

If  j=l 

i^Pii^q 


£ . aP'  + 
ij  qj 


N 

E 

k;  t=l 


E aP^^  + 
kf  qjf  ^ 


pq 


Here  A 


E a'  + E 

l,J  = l ^ ^ l.J  = l 


k,T=l 

k/i,  f =j 


TTik 

E,  . A.,  + 

kf  jf 


(11) 


In Eq. (11), $|A|$ denotes the determinant of A and the cofactors $A^{ik \cdots s}_{j\ell \cdots t}$ are the
determinants of the matrix A with rows $i, k, \cdots, s$ and columns $j, \ell, \cdots, t$ deleted and taken with
their proper signs. Note that $|A|$ and all cofactors are constant, independent of $\hat{y}_R$.


Equation (11) is a general expression valid at all frequencies. Its particular advantage
is that it readily yields the coupling coefficients at a resonance. When rank
A = N the numerator vanishes as $\hat{y}_R \to \infty$, whereas the denominator is equal to $|A| \neq 0$,
and consequently $S_{pq}$ also vanishes. When rank A < N, the numerator and denominator
contain the same powers of $\hat{y}_R$, since $\varepsilon$ is proportional to $1/\hat{y}_R$, and $S_{pq}$ therefore will
tend to a finite, non-zero value given by Equation (11).


For more insight into the resonant situation we return to Eq. (9) and note that in
view of Eq. (8), $Av \to 0$ when $\hat{y}_R \to \infty$. Writing A as the product of an R x N matrix B
with elements $b_{mn} = (\hat{e}_m, e_n)$ and its transpose $B^t$ we have

$$Av = B^t B v = 0 \tag{12}$$

where it can be shown, by premultiplying with $v^t$, that the matrices A and B have equal
rank. Equation (12) therefore is tantamount to

$$\begin{pmatrix} (\hat{e}_1, e_1) & \cdots & (\hat{e}_1, e_N) \\ \vdots & & \vdots \\ (\hat{e}_R, e_1) & \cdots & (\hat{e}_R, e_N) \end{pmatrix} \begin{pmatrix} v_1 \\ \vdots \\ v_N \end{pmatrix} = 0 \tag{13}$$

or, in other words,

$$(E^a, \hat{e}_r) = 0 , \quad r = 1, 2, \cdots, R . \tag{14}$$
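
The equal-rank statement used above rests on a short argument, sketched here under the assumption that B is real (consistent with the real cavity mode functions) while v may be complex, so that the conjugate transpose $v^\dagger$ takes the place of $v^t$:

```latex
\begin{aligned}
Av = 0 \;&\Rightarrow\; v^\dagger A v = v^\dagger B^t B v = (Bv)^\dagger (Bv) = \lVert Bv \rVert^2 = 0
        \;\Rightarrow\; Bv = 0 , \\
Bv = 0 \;&\Rightarrow\; Av = B^t (Bv) = 0 .
\end{aligned}
```

A and B therefore share the same null space, and since both have N columns they have the same rank.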

As a result we thus obtain the general condition that at a resonance the aperture field
$E^a$ must be orthogonal to the set of degenerate resonant cavity modes $\{\hat{e}_r\}$. Physically,
this means that when the modal input admittance $\hat{y}_R$ becomes infinitely large the
aperture field must not couple to the corresponding cavity mode, since it would otherwise
excite an infinite H-field.

The solution of Eq. (13) depends on the rank of B. The alternatives are:

(1) rank B = N, in which case v = 0 and all mutual coupling coefficients vanish;

(2) rank B < N, in which case solutions $v \neq 0$ and non-vanishing coupling coefficients
given by Eq. (11) are obtained. Clearly this condition occurs whenever R < N.

However, it can also occur when $R \geq N$, namely when the set of aperture modes does not
couple to one (or more) of the resonant cavity modes or some linear combination thereof.
This effectively eliminates the mode from the analysis and makes the columns of B
linearly dependent. Since an infinitesimal change in relative aperture positions would
make them independent, the corresponding solution is discontinuous with geometry. The
number of such cases is limited and forms a set of zero measure (see the example below).

A conclusion to be drawn is that for a method-of-moments analysis of the cavity a
sufficient number of aperture modes should be used in order to resolve the cavity resonance.
For a circular cylindrical cavity this leads to N > 2, and for a spherical cavity
N > 2ka + 1, which can be a very large number, ka being the great circle measured in
wavelengths.

A. An Example

Consider a two-dimensional circular cavity with two identical narrow slits as
shown in Figure 2. We approximate the aperture field of each slit by one single $\phi$-directed
mode so that $E^a = v_1 e_1 + v_2 e_2$. Since the slits are narrow we set $e_1 = \alpha \, \delta(\phi - \phi_1)$
and $e_2 = \alpha \, \delta(\phi - \phi_2)$, where $\alpha$ is the appropriate normalization constant. Similarly, the
transverse mode functions $\hat{e}_k$ of the cavity are purely $\phi$-directed and due to the rotational
symmetry they come in pairs, distinguished by a $\cos n\phi$- and $\sin n\phi$-variation only.
The set is thus composed of $\{\beta_n \cos n\phi, \ \beta_n \sin n\phi\}$, where $\beta_n$ again
are normalization constants.

Now consider the case of modes $\hat{e}_1 = \beta_M \cos M\phi$ and $\hat{e}_2 = \beta_M \sin M\phi$ at resonance.
Equation (13) becomes

$$\begin{pmatrix} (\hat{e}_1, e_1) & (\hat{e}_1, e_2) \\ (\hat{e}_2, e_1) & (\hat{e}_2, e_2) \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \alpha \beta_M \begin{pmatrix} \cos M\phi_1 & \cos M\phi_2 \\ \sin M\phi_1 & \sin M\phi_2 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = 0 .$$


We see that in most cases rank B = 2, which leads to v = 0 and $S_{11} = S_{22} = -1$, $S_{12} = S_{21} = 0$.
However, when the slits are spaced a multiple of $\pi/M$ apart the rank is reduced so that
rank B = 1. Then we use Eq. (13) to obtain the scattering coefficients


®ii=  "22=y 


l|  Yl  + ^n  ^n  ^^‘**1  ‘ ‘*’2^  ' ‘**2^^  | 


®12  = ®21  = ’*11  ^^‘•’l  ■ ‘*’2>- 


U.S. Army BMDSC


SHADOWING OF AN INHOMOGENEOUS PLANE WAVE BY AN EDGE
H.  L.  Bertoni,  A.C.  Green  and  L.  B.  Felsen 

At  high  frequencies,  electromagnetic  fields  may  be  classified  as  propagating  or 
evanescent,  depending  on  whether  their  direction  of  propagation  is  real  or  complex. 
Examples  of  evanescent  fields  are  those  occurring  on  the  edges  of  a Gaussian  beam 
and  the  inhomogeneous  plane  wave  produced  in  the  rarer  medium,  as  a result  of  total 
internal  reflection.  When  either  type  of  field  encounters  a reflecting  obstacle  of 
finite  extent,  such  as  a knife  edge,  a shadow  zone  is  formed  behind  the  obstacle.  The 
formation  of  the  shadow  zone  is  well  understood  for  the  case  of  propagating  fields, 
and  can  be  described  in  terms  of  ray  optics. 

For  the  case  of  evanescent  waves,  the  mechanism  of  shadow  formation  is  not 
understood.  In  addition,  the  meaning  of  the  shadow  region  is  in  question,  since  the 
incident  evanescent  field  may  be  weaker  in  the  shadow  than  the  diffracted  field  gen- 
erated by  the  obstacle.  In  order  to  study  the  mechanism  of  shadow  formation  and  its 
significance for evanescent fields, we have considered the problem of inhomogeneous
plane wave diffraction by a perfectly conducting semi-infinite screen. By a careful
asymptotic treatment and error analysis of the known exact solution, the location of the
shadow  boundary  is  determined,  as  is  the  transition  region  surrounding  it  wherein  the 
field  cannot  be  separated  into  incident  and  diffracted  constituents.  It  is  found  that  for 
strongly  evanescent  fields,  the  shadow  boundary  and  transition  zones  differ  markedly 
from  those  for  a homogeneous  plane  wave. 

A. Limitation of Ray Optics for Evanescent Fields

The fields of propagating waves can be described in terms of a family of rays
whose paths in space are determined by the laws of geometrical optics [1,2]. As
illustrated in Fig. 1, for a plane wave, when the ray family encounters a reflecting
obstacle of finite extent, such as the knife edge, some of the rays miss the obstacle and
are unaffected by its presence. The portion of the ray family illuminating the obstacle
is reflected. The rays separating the two portions of the incident ray family will be
referred  to  as  critical  rays.  Extension  of  the  critical  rays  past  the  obstacle  defines 
the  boundary  of  its  shadow.  Treating  the  critical  rays  as  if  they  were  reflected  de- 
fines the  shadow  boundary  of  the  family  of  reflected  rays  (see  Figure  1).  In  addition, 
the  critical  ray  excites  a family  of  diffracted  rays,  propagating  away  from  the  edge  in 
all  directions. 

Fig. 1. Shadow formation and diffraction for a homogeneous plane wave incident on a
conducting half-screen.

Except in transition regions that are of finite width about the shadow boundaries,
the fields associated with the incident rays, the reflected rays and the diffracted rays
are all distinct and can be computed using the Geometrical Theory of Diffraction (GTD).
Within the transition regions one cannot find the field using the simple GTD formulas,
which are discontinuous or divergent at the shadow boundaries. Instead, modified
expressions that give a continuous variation of the field across the boundary must be
employed [3,4]. The failure of the simple GTD formulas reflects the fact that the rays
of  two  different  families  are  nearly  parallel  in  the  transition  region,  and  therefore 
the  two  families  do  not  have  distinct  physical  properties  there.  Thus  while  a pre- 
cisely defined  shadow  boundary  can  be  envisioned  mathematically,  the  character  of  the 
fields  in  the  transition  region  is  such  that  the  boundary  cannot  be  localized  with  in- 
finite precision,  except  in  the  limit  of  zero  wavelength. 

Evanescent fields in a lossless medium cannot be described in terms of rays
whose paths lie in real space, even at high frequencies [5-9]. The direction of propagation
of the fields, and hence the angle made with some real coordinate axis, is complex.
When such fields are diffracted by an obstacle, the shadow boundary cannot
therefore  be  defined  by  the  critical  rays  described  above.  While  it  seems  evident 
that  the  incident  wave  illuminates  certain  regions  and  is  blocked  by  the  obstacle  from 
reaching  other  (shadow)  regions,  it  is  not  a priori  evident  where  the  shadow  boundary 
lies.  Furthermore,  the  variation  of  the  fields  in  the  vicinity  of  the  shadow  boundary, 
and  hence  the  character  of  the  shadow,  are  not  clear  since  the  exponential  decay  of 
the  incident  wave  may  imply  that  its  field  is  smaller  than  the  diffracted  field  along  the 
shadow  boundary. 

The  problem  of  locating  the  shadow  boundary  has  been  previously  recognized,  ^ 
but has not been settled conclusively. For a smoothly curved cylindrical obstacle, a
study of the creeping wave behavior revealed a shift of the boundary from its geometric
optical location when the incident field is evanescent. While the physical
process associated with wave propagation in a complex direction was studied [9], it was
not  used  to  predict  the  location  of  shadow  boundaries.  In  treating  edge  diffraction  of 
a Gaussian  beam  using  complex  rays,  Otis^^  somewhat  arbitrarily  defined  the  shadow 
boundary  by  taking  the  magnitude  of  the  complex  transverse  coordinate  of  the  complex 
ray  passing  through  the  edge,  but  did  not  investigate  the  character  of  shadow  formation. 
This  prescription  for  the  shadow  boundary  is  not  consistent  with  that  obtained  here, 
but  the  error  that  is  introduced  in  the  fields  by  its  use  is  smaller  than  the  amplitude 
of  the  diffracted  field. 

B.  Evaluation  of  the  Field  and  the  Shadow  Boundaries 

In order to gain detailed insight into the location and character of shadow boundaries
for inhomogeneous waves we have returned to the problem of evanescent plane
wave diffraction by a conducting half-screen. A harmonic plane wave, with time
dependence $\exp(-i\omega t)$ implied, is assumed to be incident in the x-z plane at the complex
angle $\theta_0 = u + iv$ on a perfectly conducting half-screen lying in the half-plane
x < 0, z = 0, as shown in Figure 2.

Fig. 2. Shadow boundaries for an inhomogeneous plane wave incident on a conducting
half-screen at a complex angle $\theta_0 = u + iv$ for the case v > 0. The phase paths [10]
for the incident field are the family of lines parallel to $\theta = u$.

For simplicity, the electric field is assumed to be polarized along y. The variation of the
electric field is given by

$$E_{inc}(x, z) = \exp[\, ik\rho \cos(\theta - \theta_0) \,] = \exp[\, i \mathbf{k} \cdot \boldsymbol{\rho} \,] , \tag{1}$$


where $k = \omega\sqrt{\epsilon\mu}$ is the wavenumber of the medium and the cylindrical coordinates
$(\rho, \theta)$ are as indicated in Figure 2. In Eq. (1), $\theta_0$ is the angle that the wave vector

$$\mathbf{k} = \hat{x} \, k \sin\theta_0 + \hat{z} \, k \cos\theta_0 \tag{2}$$

makes with the z axis. For an inhomogeneous plane wave in a lossless medium
(k real), $\theta_0 = u + iv$ is complex and correspondingly, $\mathbf{k} = \mathrm{Re}\,\mathbf{k} + i \, \mathrm{Im}\,\mathbf{k}$ is complex.
The vectors $\mathrm{Re}\,\mathbf{k}$ and $\mathrm{Im}\,\mathbf{k}$ are perpendicular to one another as indicated in Fig. 2, and
$\mathrm{Im}\,\mathbf{k}$ points in the direction of maximum rate of decay of the fields.
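
The perpendicularity of the real and imaginary parts of the wave vector can be verified directly from Eq. (2) with $\theta_0 = u + iv$. The following worked check uses only the standard expansions of the sine and cosine of a complex argument:

```latex
\begin{aligned}
\mathbf{k} &= \hat{x}\,k\sin(u+iv) + \hat{z}\,k\cos(u+iv), \\
\mathrm{Re}\,\mathbf{k} &= k\cosh v\,(\hat{x}\sin u + \hat{z}\cos u), \\
\mathrm{Im}\,\mathbf{k} &= k\sinh v\,(\hat{x}\cos u - \hat{z}\sin u), \\
\mathrm{Re}\,\mathbf{k}\cdot\mathrm{Im}\,\mathbf{k} &= k^2\cosh v\,\sinh v\,(\sin u\cos u - \cos u\sin u) = 0 .
\end{aligned}
```

The same expressions give $|\mathrm{Re}\,\mathbf{k}|^2 - |\mathrm{Im}\,\mathbf{k}|^2 = k^2$, as required for a lossless medium with real k.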

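As a result of reflection from the screen and diffraction at its edge, the total
electric field $E(\rho, \theta)$, which includes the incident plane wave, is found to be

$$E(\rho, \theta) = \exp[\, ik\rho \cos(\theta - \theta_0) \,] \, Q(p) - \exp[\, -ik\rho \cos(\theta + \theta_0) \,] \, Q(q) . \tag{3}$$

Here

$$p = e^{-i\pi/4} \sqrt{2k\rho} \, \sin\!\Big(\frac{\theta_0 - \theta}{2}\Big) , \qquad q = e^{-i\pi/4} \sqrt{2k\rho} \, \cos\!\Big(\frac{\theta_0 + \theta}{2}\Big) \tag{4}$$

and

$$Q(s) = \frac{1}{\sqrt{\pi}} \int_s^{\infty} e^{-t^2} \, dt \tag{5}$$

is one half the complementary error function.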

For $|s| \gg 1$, we may approximate Q(s) asymptotically [13] as

$$Q(s) \sim \frac{e^{-s^2}}{2\sqrt{\pi}\, s} \, I(s) ; \quad \mathrm{Re}(s) > 0$$

$$Q(s) \sim 1 + \frac{e^{-s^2}}{2\sqrt{\pi}\, s} \, I(s) ; \quad \mathrm{Re}(s) < 0 \tag{6}$$

with

$$I(s) = 1 + \sum_{n=1}^{N} \frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{(-2)^n \, s^{2n}} . \tag{7}$$

The series in Eq. (7) is asymptotic in the sense that the terms initially decrease in
magnitude for fixed large s, but beyond a certain value of n they increase in magnitude.
While taking an infinite number of terms would cause I(s) and thereby Q(s) to diverge,
a limited number of terms will give a good approximation to Q(s). Since the error is
of the order of the first term neglected [14], the most accurate sum for Q(s) is obtained
by cutting off at a value N such that the n = N + 1 term is minimum. Because of the
variation of the terms with n, as cited above, N can be found from the condition that
the n = N and n = N + 1 terms be the two most nearly equal in magnitude. This definition
of N is implied in Eq. (7), recognizing that N will be a function of the coordinates
$(\rho, \theta)$.
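
The truncation rule can be tried out numerically. The sketch below is restricted to real s > 0 (an assumption made so that the standard-library `math.erfc` can serve as the reference; in the diffraction problem s is complex), and it assumes the form $Q(s) \sim e^{-s^2} I(s) / (2\sqrt{\pi}\, s)$ with the series of Eq. (7). The function names are hypothetical.

```python
import math

def q_exact(s):
    """Q(s) for real s: one half of the complementary error function."""
    return 0.5 * math.erfc(s)

def q_asymptotic(s):
    """Asymptotic Q(s) for real s with 2*s*s > 1, cut off just before the smallest term.

    Returns (N, approximation), where N is the index of the last term kept in I(s).
    """
    if 2.0 * s * s <= 1.0:
        raise ValueError("s is too small for the asymptotic series to be useful")
    terms = [1.0]                        # terms[n] is the n-th term of I(s); terms[0] = 1
    while True:
        n = len(terms)
        nxt = -terms[-1] * (2 * n - 1) / (2.0 * s * s)   # term n from term n-1
        if abs(nxt) >= abs(terms[-1]):   # terms have started to grow again
            break
        terms.append(nxt)
    smallest = len(terms) - 1            # index of the smallest term, i.e. N + 1
    N = smallest - 1
    I = sum(terms[:N + 1])               # keep terms 0..N, neglect the smallest one
    return N, math.exp(-s * s) / (2.0 * math.sqrt(math.pi) * s) * I

for s in (2.0, 3.0):
    N, approx = q_asymptotic(s)
    rel_err = abs(approx - q_exact(s)) / q_exact(s)
    print(s, N, rel_err)
```

For larger s the smallest term occurs at a larger n and is itself smaller, so both the optimal N and the attainable accuracy grow with |s|; this is why N depends on the observation point.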

Using Eqs. (6) and (7) in Eq. (3), the total field can be separated into a geometrical
optics component $E_{GO}$ and a diffracted component $E_D$.

The geometrical optics component is

$$E_{GO} = \exp[\, ik\rho \cos(\theta - \theta_0) \,] \, U(\theta - \theta_{si}) - \exp[\, -ik\rho \cos(\theta + \theta_0) \,] \, U(\theta - \theta_{sr}) . \tag{8}$$

Here U(t) is the unit step function and the angles $\theta_{si}$ and $\theta_{sr}$ are found from the
conditions Re p = 0 and Re q = 0, respectively. The first term in Eq. (8) represents the
incident plane wave, which illuminates the region $3\pi/2 > \theta > \theta_{si}$, but does not reach the
region $\theta_{si} > \theta > -\pi/2$. The second term is the plane wave reflected from the half-screen,
and occupies the region $3\pi/2 > \theta > \theta_{sr}$. It is evident from Eq. (8) that $\theta_{si}$
and $\theta_{sr}$ give the shadow boundaries of the incident and reflected fields. From Eq. (4)
and the conditions cited after Eq. (8) these boundaries are given by

$$\theta_{si} = u + \tan^{-1}(\sinh v) , \qquad \theta_{sr} = \pi - \theta_{si} . \tag{9}$$
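
Equation (9) can be checked against the conditions Re p = 0 and Re q = 0. The sketch below assumes $p = e^{-i\pi/4}\sqrt{2k\rho}\,\sin[(\theta_0 - \theta)/2]$ and $q = e^{-i\pi/4}\sqrt{2k\rho}\,\cos[(\theta_0 + \theta)/2]$, the sign convention that goes with the $\exp(-i\omega t)$ time dependence; the values of u, v and $k\rho$ and the function names are hypothetical.

```python
import cmath
import math

def p_arg(k_rho, theta, theta0):
    """Argument p of Q for the incident field (assumed sign convention, see text)."""
    return (cmath.exp(-1j * math.pi / 4) * math.sqrt(2 * k_rho)
            * cmath.sin((theta0 - theta) / 2))

def q_arg(k_rho, theta, theta0):
    """Argument q of Q for the reflected field (same assumed convention)."""
    return (cmath.exp(-1j * math.pi / 4) * math.sqrt(2 * k_rho)
            * cmath.cos((theta0 + theta) / 2))

u, v = 0.5, 0.8                       # hypothetical complex angle of incidence u + iv
theta0 = complex(u, v)
k_rho = 10.0

theta_si = u + math.atan(math.sinh(v))    # shadow boundary of the incident field
theta_sr = math.pi - theta_si             # shadow boundary of the reflected field

re_p = p_arg(k_rho, theta_si, theta0).real
re_q = q_arg(k_rho, theta_sr, theta0).real
re_p_lit = p_arg(k_rho, theta_si + 0.1, theta0).real   # just inside the lit region
print(theta_si, re_p, re_q, re_p_lit)
```

Re p changes sign at $\theta_{si}$ and is negative on the illuminated side, which is where Q(p) tends to unity and the incident wave is present.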


For the case of a homogeneous plane wave, v = 0 and $u = \theta_0$, so that $\theta_{si} = \theta_0$
and $\theta_{sr} = \pi - \theta_0$ from Eq. (9) and the shadow boundaries are obtained from the critical
ray, as in Figure 1. When $\theta_0$ is complex, $\mathrm{Re}\,\mathbf{k}$ makes an angle u with the z-axis, as
in Figure 2. For v > 0, $\mathrm{Im}\,\mathbf{k}$ makes an angle u with the positive x-axis, as shown in
Figure 2. In this case the shadow boundary $\theta = \theta_{si}$ for the incident plane wave lies to
the right of the screen and above the radial line $\theta = u$. Because the field decays in
the direction of $\mathrm{Im}\,\mathbf{k}$, and since $\theta_{si} > u$, the incident field along its shadow boundary
decays away from the edge and therefore becomes exponentially small compared to
the diffracted field, which is given below. For v < 0, $\mathrm{Im}\,\mathbf{k}$ has a negative x component
and $\theta_{si} < u$, so that for this case also, the incident field is exponentially small along
its shadow boundary. Similar statements apply to the variation of the reflected field
along its shadow boundary $\theta_{sr}$.



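The diffracted field for $|p| \gg 1$ and $|q| \gg 1$ is obtained from Eqs. (3) and (6) as

$$E_D = \frac{e^{i(k\rho + \pi/4)}}{2\sqrt{2\pi k\rho}} \left[ \frac{I(p)}{\sin\frac{\theta_0 - \theta}{2}} - \frac{I(q)}{\cos\frac{\theta_0 + \theta}{2}} \right] . \tag{10}$$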

For a homogeneous incident plane wave, $\theta_0$ is real and $E_D$ as given in Eq. (10) is infinite
along the shadow boundaries $\theta = \theta_0, \ \pi - \theta_0$. This behavior results from the fact that
|p| or |q| vanishes on the shadow boundary so that the asymptotic expansion is not
valid. In the case of an inhomogeneous plane wave, $\theta_0$ is complex so that
$\sin\frac{\theta_0 - \theta}{2}$ and $\cos\frac{\theta_0 + \theta}{2}$ do not vanish along the shadow boundary. Thus for $k\rho$
sufficiently large, Eq. (10) will be valid on the shadow boundaries, while if $k\rho$ is small
enough, the condition $|p| \gg 1$ or $|q| \gg 1$ will not be satisfied there, and hence Eq. (10)
cannot be used. In the next section we examine, for $\theta_0$ complex, the regions in space
where the conditions for Eq. (10) to hold are satisfied, and we explore the character
of the field in the vicinity of the shadow boundaries.

C.  Transition  Regions  and  the  Character  of  the  Shadow  Boundary 

Based on the asymptotic approximation of the complementary error function, we have defined the shadow boundaries produced by the edge of the screen. The nature of the shadowing effect across the boundaries remains to be determined. In order for the asymptotic approximation leading to Eqs. (8) and (10) to be valid, it is necessary that $|p| \gg 1$ and $|q| \gg 1$, the first condition being significant for the shadow boundary $\varphi_{si}$ of the incident field and the second applying to the shadow boundary $\varphi_{sr}$ of the reflected field.

Let A be a large number such that the conditions $|p| \gg 1$ and $|q| \gg 1$ are satisfied whenever $|p| \ge \sqrt{A}$ and $|q| \ge \sqrt{A}$. The significance of this number will be considered subsequently. For the incident field, the condition $|p| > \sqrt{A}$ will be satisfied at points $(\rho, \varphi)$ outside of a transition region whose boundary is given by

$$\rho\,[\cosh v - \cos(\varphi - u)] = A/k. \qquad (11)$$

In the case of a homogeneous plane wave with v = 0, the curve given by Eq. (11) is a parabola with focus at the edge $\rho = 0$ and centered about the shadow boundary $\varphi = \varphi_{si} = \varphi_0$.

In the case of an inhomogeneous plane wave, the curve described by Eq. (11) is an ellipse with one focus at the edge $\rho = 0$ and major axis oriented along the radial line $\varphi = u$, as indicated in Figure 3. The length L and width W of the ellipse follow from Eq. (11) as

$$L = \frac{2A\cosh v}{k\sinh^2 v}, \qquad W = \frac{2A}{k\,|\sinh v|}.$$
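The dimensions of the transition region can be checked numerically from Eq. (11) alone. Reading Eq. (11) as a polar conic with its focus at the edge gives an eccentricity of 1/cosh v, from which the closed forms used below follow. The derivation and the function names are an illustrative sketch under that reading (not the authors' own formulas), and the code assumes v is nonzero.

```python
import math

def rho_on_boundary(theta, A, k, v):
    """Radius of the Eq. (11) boundary at angle theta = phi - u."""
    return A / (k * (math.cosh(v) - math.cos(theta)))

def transition_ellipse(A, k, v):
    """Length (major axis) and width (minor axis) of the Eq. (11) ellipse.

    Derived here from the polar-conic form of Eq. (11); valid for v != 0.
    """
    if v == 0:
        raise ValueError("v = 0 gives a parabola, not an ellipse")
    length = 2.0 * A * math.cosh(v) / (k * math.sinh(v) ** 2)
    width = 2.0 * A / (k * abs(math.sinh(v)))
    return length, width

if __name__ == "__main__":
    k = 2.0 * math.pi          # wavenumber for unit wavelength (sample value)
    for v in (0.2, 0.5, 1.0):
        L, W = transition_ellipse(A=4.0, k=k, v=v)
        print(f"v = {v}: L = {L:.3f}, W = {W:.3f} (in wavelengths)")
```

Both dimensions grow without bound as v tends to zero, which is consistent with the ellipse opening into the parabola of the homogeneous case.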
The value of N is a function of $|p|$, and hence of position $(\rho, \varphi)$, which can be found by requiring the n = N and n = N + 1 terms in Eq. (7) to be the two most nearly equal in magnitude. Thus N is the integer closest to the number $\bar{N}$ given by


Using  Eqs.  (14)  in  (13)  it  is  found  after  some  manipulation  that 


(15) 


In view of Eq. (15), the ellipse given by Eq. (11) represents a contour of constant error on which the error is $O[1/(\sqrt{2\pi A}\,e^{A})]$. Outside of the ellipse, the error has a lower value while inside the error is greater.
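The truncation rule described above can be illustrated with the standard large-argument expansion of the complementary error function. Eq. (7) is not reproduced in this section, so the sketch below assumes the textbook series $\mathrm{erfc}(x) \sim \frac{e^{-x^2}}{x\sqrt{\pi}} \sum_n (-1)^n \frac{(2n-1)!!}{(2x^2)^n}$ with real x > 0, whereas the analysis in the text uses a complex argument p. All function names are hypothetical.

```python
import math

def term_magnitude(x, n):
    """Magnitude (2n-1)!! / (2 x^2)^n of the n-th term of the assumed series."""
    mag = 1.0
    for j in range(1, n + 1):
        mag *= (2 * j - 1) / (2.0 * x * x)
    return mag

def erfc_asymptotic(x, n_terms):
    """Partial sum over n = 0..n_terms of the assumed expansion of erfc(x)."""
    prefactor = math.exp(-x * x) / (x * math.sqrt(math.pi))
    term, total = 1.0, 1.0
    for n in range(1, n_terms + 1):
        term *= -(2 * n - 1) / (2.0 * x * x)
        total += term
    return prefactor * total

def optimal_order(x):
    """Index N for which terms N and N+1 are most nearly equal in magnitude."""
    n, mag = 0, 1.0
    best_n, best_gap = 0, float("inf")
    while True:
        nxt = mag * (2 * n + 1) / (2.0 * x * x)
        gap = abs(nxt - mag)
        if gap < best_gap:
            best_n, best_gap = n, gap
        if nxt > mag and gap > best_gap:
            return best_n      # past the smallest term; gaps only grow from here
        n, mag = n + 1, nxt

if __name__ == "__main__":
    x = 2.5
    N = optimal_order(x)
    error = abs(erfc_asymptotic(x, N) - math.erfc(x))
    print(f"x = {x}: N = {N}, truncation error = {error:.3e}")
```

For this assumed series the ratio of successive term magnitudes is $(2n+1)/(2x^2)$, so the criterion selects N close to $x^2 - 1/2$, and the residual error is of the order of the smallest term, which decays like $e^{-x^2}$.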


Before considering the significance of the shadow boundary we examine the error made in using the Geometrical Theory of Diffraction (GTD) approximation for $E_D$. The GTD approximation is obtained from Eq. (10) using I(p) = I(q) = 1, so that

$$E_D \approx E_{GTD} = \frac{e^{i(k\rho + \pi/4)}}{2\sqrt{2\pi k\rho}} \times [\text{angular factor, illegible in the source}] \qquad (16)$$


With this approximation for $E_D$, the error in the vicinity of the shadow boundary $\varphi_{si}$ is of the order of the field contribution from the n = 1 term of I(p). From Eqs. (7) and (10), it is seen that

error (0) = O[ argument illegible in the source ]   (17)


Thus, the ellipse of Eq. (11) also represents a contour of constant error for the GTD approximation, Eq. (16), on which the error is $O[1/(2\sqrt{\pi}\,A^{3/2})]$.


The significance of the shadow boundary $\varphi_{si}$ can be seen by comparing the error, Eq. (15), to the incident field. After some manipulation it is found that


(18) 
Outside of the ellipse of Eq. (11) with A = 1, the ratio of Eq. (18) is seen to be less than unity for all $\varphi$. For $k\rho \gg 1$, the dependence of Eq. (18) on $\varphi$ is dominated by the exponential term, which is greatest when $\varphi = \varphi_{si}$. Thus, error (N) is most nearly equal to $|E_{inc}|$ along the radial line $\varphi = \varphi_{si}$, whereas error (N) decreases in comparison to $|E_{inc}|$ as $\varphi$ deviates from $\varphi_{si}$.

To see the significance of Eq. (18), suppose that we had arbitrarily taken the shadow boundary to be at some angle $\bar{\varphi}_{si}$ other than that given by Equation (9). In this case the first term of Eq. (8) would have as a factor the unit step function $U(\varphi - \bar{\varphi}_{si})$, instead of $U(\varphi - \varphi_{si})$. The discontinuity generated in the asymptotic approximation by $U(\varphi - \bar{\varphi}_{si})$ would, in view of Eq. (18), be larger in comparison to error (N) than if we had taken $\bar{\varphi}_{si} = \varphi_{si}$. In other words, the discontinuity produced by dropping $E_{inc}$ as the shadow boundary is crossed has the least effect on the overall error when the shadow boundary is taken along the radial line $\varphi = \varphi_{si}$. Because $E_{inc}$ is exponentially small compared to $E_D$ along the shadow boundary, its omission is referred to in the mathematical literature on asymptotics as the Stokes phenomenon (Ref. 15), and the shadow boundary is known as a Stokes line. If one is satisfied with less than the highest accuracy possible and uses the GTD approximation, Eq. (16), for the diffracted field, the shadow boundary may be more loosely defined, provided that the magnitude of the incident field along the boundary does not exceed the error in Equation (17). However, the choice $\varphi_{si} = u$, as implied in Ref. 10, is not valid since $|E_{inc}| = 1$ and therefore its omission on one side of the radial line $\varphi_{si} = u$ would result in an unacceptably large discontinuity.

The exponential factor in Eq. (18) is constant on contours given by

$$k\rho\,[1 - \cos(\varphi_{si} - \varphi)] = B, \qquad (19)$$

where B is some constant. Expression (19) is the equation of a parabola about $\varphi = \varphi_{si}$ with focus at the edge. In the limit as $v \to 0$, the ellipse, Eq. (11), which is a contour of constant absolute error, coincides with the parabola, Eq. (19), for B = A. For evanescent fields, the two contours are distinct.
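The limiting statement above can be checked numerically by comparing the radii given by Eqs. (11) and (19) at a few observation angles. The sketch below is illustrative only: the function names, the sample angles, and the choice B = A are assumptions made for the comparison.

```python
import math

def rho_ellipse(phi, u, v, A, k):
    """Radius on the constant-error contour of Eq. (11)."""
    return A / (k * (math.cosh(v) - math.cos(phi - u)))

def rho_parabola(phi, phi_si, B, k):
    """Radius on the constant-exponent contour of Eq. (19)."""
    return B / (k * (1.0 - math.cos(phi_si - phi)))

def max_relative_gap(u, v, A, k, offsets=(0.5, 1.0, 2.0, 3.0)):
    """Largest relative difference between the two contours (with B = A)
    at the angles phi = u + offset, offsets in radians."""
    phi_si = u + math.atan(math.sinh(v))   # Eq. (9)
    gaps = []
    for d in offsets:
        phi = u + d
        e = rho_ellipse(phi, u, v, A, k)
        p = rho_parabola(phi, phi_si, A, k)
        gaps.append(abs(e - p) / p)
    return max(gaps)

if __name__ == "__main__":
    for v in (1.0, 0.1, 0.01, 0.0001):
        gap = max_relative_gap(u=0.6, v=v, A=5.0, k=2.0 * math.pi)
        print(f"v = {v}: max relative gap = {gap:.2e}")
```

The gap shrinks roughly in proportion to v, while for v of order unity the two contours differ substantially, in line with the remark that they are distinct for evanescent fields.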

The error associated with the asymptotic approximation of Q(q) in Eq. (3) has the same properties as those described above for Q(p), except that the relevant geometrical shapes are the mirror images in the (x-y) plane of those for Q(p), as in Figure 3. Thus, the contours of constant absolute error are ellipses centered about the line $\varphi = \pi - u$. Along the shadow boundary $\varphi = \varphi_{sr} = \pi - \varphi_{si}$, error (N) for Q(q) is largest compared to the magnitude of the reflected field, given by the second exponential in Equation (10). Thus dropping the reflected field across the radial line $\varphi = \varphi_{sr}$ produces the least discontinuity in the total field, relative to the error inherent in the asymptotic approximation.
D. Summary

This study of high-frequency evanescent plane wave diffraction by a perfectly conducting semi-infinite plane has clarified the location of the shadow boundaries, the transition regions surrounding them, and the mechanism of formation of the shadow. The concepts of illuminated and shadow zones are meaningful only at high frequencies and hence are tied intimately to the asymptotic behavior of the rigorously formulated field. It has been found that, as in the case of homogeneous plane wave diffraction, the high frequency field can be separated into incident, reflected, and edge diffracted constituents. However, sufficiently far from the edge, this quasi-optic description holds for all observation angles, without need of transition functions, since the transition regions, in contrast to the homogeneous wave case, are of finite extent (see Figure 3). The reason for this simplification is the negligible discontinuity in the field values caused by deletion of the evanescent incident field on crossing the shadow boundary. To achieve this behavior, the shadow boundary must be chosen appropriately and related to the error inherent in the asymptotic field representation. For minimal error, this has led to an identification of the shadow boundaries with the Stokes lines in the asymptotic field expansion.

When a larger than minimal error is acceptable in the asymptotic field formulation, as for example by retaining only the leading term in the asymptotic expansion of the edge diffracted field, the location of the shadow boundary becomes less precise. A range of contours near the optimal location exists whenever the incident field is smaller than the permitted error. Yet it is important to observe that the intuitively appealing incident phase path $\varphi = u$ that grazes the edge does not meet this requirement and is therefore unacceptable. The same behavior is found to occur when the incident field is a Gaussian beam.
RADIO-WAVE PROPAGATION ALONG MIXED PATHS IN FOREST ENVIRONMENTS

T.  Tamir 

The propagation of radio waves in the presence of wooded areas has been the subject of intensive studies, but most investigations have dealt with communication paths located entirely within or close to the vegetation. In particular, numerous field-strength measurements have recently been obtained along paths that connect two points inside a forest, but fewer such measurements were taken between a point inside the vegetation and another point outside the wooded area. Similarly, theoretical studies have concentrated on propagation paths that lie entirely inside the vegetation region and less attention has been paid to paths that lie partly outside the forest.

The aim of the present report is to take a closer look at the propagation mechanism for mixed paths in forest environments and to present analytical results for the relevant field amplitudes. An important feature of this work is to show that these analytical results can be described in terms of ray paths which permit the calculation of radio losses in a large class of forest environments. The extension of this ray approach to more complex situations is straightforward. As an example, the ray representation is discussed in the context of a propagation path that traverses two wooded regions which are separated by a region containing no vegetation.

In agreement with previous work (Refs. 1-4), the idealized model for mixed-path considerations in a forest environment is given by the configuration shown in Figure 1. A forest layer having an effective tree height h is shown to the left of the x = b plane, with the space to the right of that plane being a bare ground region. Within the frequency range of (approximately) 2-200 MHz, the vegetation layer is assumed to be a homogeneous refractive medium with a relative permittivity $\epsilon_1 = n^2$, where n is the average refractive index of the vegetation. Similarly, the relative permittivity of the ground is denoted by $\epsilon_2 = N^2$.

Fig. 1. Geometry of a forest layer adjacent to a bare ground region.

To arrive at simple field considerations, we assume that a dipole antenna is placed at the point $T = T(0, 0, z_0)$, where $z_0 < h$ is inside the forest. Both vertical and horizontal polarization will be examined but, for simplicity, we consider only the field detected at a point R = R(x, 0, z) by a dipole oriented for maximum reception of the signal.

An important feature of the forest-layer geometry is that, depending on the particular location of the receiving point R, four different canonical situations (or propagation regimes) exist, each of which is characterized by a different type of wave mechanism and therefore by a different expression for the electromagnetic field. As indicated in Fig. 1 by Roman numerals, these various regimes correspond to the receiving point being located as follows: (I) inside the vegetation layer, (II) in the air region above the vegetation layer, (III) at a relatively high altitude in the air above the bare ground region, and (IV) at a relatively low height above the bare-ground region.

A.  The  Field  Inside  the  Forest  (Region  I) 

Because the wave reflections are negligible at the forest-air interface, the forest boundary at x = b may be neglected for reception points R(x, 0, z) located inside the vegetation layer. In this case, the forest may be assumed to extend also over the entire region x > b, as shown in Figure 2.


Fig.  2.  Ray  path  TABR  for  the  lateral  wave 
inside  the  forest  layer  (Region  I). 

When the forest layer covers the entire ground plane, the field problem reduces to a situation whose solution is already available in the literature (Refs. 2-4). To establish notation, we repeat here the pertinent results for the electric field E detected at points inside the forest ($x \le b$, $0 < z \le h$). For large distances $|x|$ and a time dependence $\exp(-i\omega t)$, which is suppressed, a saddle-point evaluation of $E = E_I$ (in Region I) yields

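$$E_I = [\,\cdot\,]\;\frac{Ia}{n^2-1}\;\frac{e^{ik[|x| + \sqrt{n^2-1}\,(2h-z-z_0)]}}{x^2}\;F(90^\circ,z)\,F(90^\circ,z_0), \qquad (1)$$

where $[\,\cdot\,]$ stands for a leading numerical coefficient that is illegible in the source, Ia is the dipole moment, $k = 2\pi/\lambda$ is the plane-wave propagation factor in air, and

$$F(\theta,z) = \frac{1 + B(\theta,z)}{1 - B(\theta,h)}, \qquad (2)$$

$$B(\theta,z) = \Gamma(\theta)\,e^{2ikz\sqrt{n^2-\sin^2\theta}}, \qquad (3)$$

$$\Gamma(\theta) = \frac{M(n^2-\sin^2\theta)^{1/2} - m(N^2-\sin^2\theta)^{1/2}}{M(n^2-\sin^2\theta)^{1/2} + m(N^2-\sin^2\theta)^{1/2}}, \qquad (4)$$

$$M = m = 1 \ \text{for horizontal polarization}; \qquad M = N^2,\ m = n^2 \ \text{for vertical polarization}. \qquad (5)$$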
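The auxiliary quantities of Eqs. (2)-(5) are simple to evaluate numerically. The sketch below is one possible reading of those equations: it assumes that the exponent in Eq. (3) is $2ikz\sqrt{n^2-\sin^2\theta}$, that M = N² and m = n² for vertical polarization, and that the principal branch of the complex square root is the appropriate one for the suppressed $\exp(-i\omega t)$ time dependence. Function and argument names are hypothetical.

```python
import cmath
import math

def polarization_factors(n, N, polarization):
    """Eq. (5): return (M, m) for the chosen polarization."""
    if polarization == "horizontal":
        return 1.0, 1.0
    if polarization == "vertical":
        return N ** 2, n ** 2
    raise ValueError("polarization must be 'horizontal' or 'vertical'")

def gamma(theta, n, N, polarization):
    """Eq. (4): reflection coefficient at the forest-ground interface."""
    M, m = polarization_factors(n, N, polarization)
    s2 = math.sin(theta) ** 2
    a = M * cmath.sqrt(n ** 2 - s2)
    b = m * cmath.sqrt(N ** 2 - s2)
    return (a - b) / (a + b)

def B(theta, z, k, n, N, polarization):
    """Eq. (3), with the exponent taken as 2ikz*sqrt(n^2 - sin^2(theta))."""
    kz = k * cmath.sqrt(n ** 2 - math.sin(theta) ** 2)
    return gamma(theta, n, N, polarization) * cmath.exp(2j * kz * z)

def F(theta, z, h, k, n, N, polarization):
    """Eq. (2): height-dependent factor F(theta, z)."""
    return (1 + B(theta, z, k, n, N, polarization)) / (
        1 - B(theta, h, k, n, N, polarization))

if __name__ == "__main__":
    # Sample values only; they are not taken from the report.
    n = cmath.sqrt(1.1 + 0.1j)     # forest refractive index
    N = cmath.sqrt(20.0 + 5.0j)    # ground refractive index
    k = 2.0 * math.pi / 10.0       # 10 m wavelength
    for pol in ("horizontal", "vertical"):
        print(pol, abs(F(math.pi / 2, 5.0, 10.0, k, n, N, pol)))
```

When the ground and the forest have the same index (N = n) the reflection coefficient vanishes and F reduces to unity, which gives a convenient sanity check.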


The field given by $E_I$ is that of a lateral wave, which follows the path TABR in Fig. 2, as can be verified by noting that the exponent in Eq. (1) represents an electrical length given by

$$kn(\overline{TA} + \overline{BR}) + k\,\overline{AB} = k(ns\sec\theta_c + x - s\tan\theta_c) = k(|x| + \sqrt{n^2-1}\,s),$$

where $s = 2h - z - z_0$, and the angle $\theta_c$ shown in Fig. 2 is the critical angle of total reflection: $\theta_c = \sin^{-1}(1/n)$. Of course, $\theta_c$ is generally complex because n is a complex quantity. However, for small losses (Im n ≪ Re n), the real part of $\theta_c$ is predominant and it then yields the physical interpretation of the ray paths TA and BR in Figure 2.
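The ray-path identity for the lateral wave can be confirmed numerically for a real refractive index. The sketch below adds up the two slant segments inside the forest and the segment along the tree tops and compares the result with the closed form $k(|x| + \sqrt{n^2-1}\,s)$; the function names and the sample numbers are illustrative assumptions.

```python
import math

def lateral_path_length(x, s, n, k):
    """Electrical length k*n*(TA + BR) + k*AB of the path TABR (real n > 1).

    s = 2h - z - z0 is the total vertical rise of the two slant segments.
    """
    theta_c = math.asin(1.0 / n)               # critical angle
    slant = s / math.cos(theta_c)              # TA + BR, inside the forest
    along_top = x - s * math.tan(theta_c)      # AB, along the tree tops
    return k * (n * slant + along_top)

def lateral_closed_form(x, s, n, k):
    """Closed form k*(|x| + sqrt(n^2 - 1)*s) appearing in the exponent of Eq. (1)."""
    return k * (abs(x) + math.sqrt(n * n - 1.0) * s)

if __name__ == "__main__":
    x, s, n, k = 1000.0, 25.0, 1.05, 2.0 * math.pi / 3.0   # sample values
    print(lateral_path_length(x, s, n, k), lateral_closed_form(x, s, n, k))
```

The agreement follows from $n\sec\theta_c - \tan\theta_c = \sqrt{n^2-1}$ when $\sin\theta_c = 1/n$.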

The agreement of measured data with the field predicted by Eq. (1) has been well documented. In particular, $E_I$ was found to decrease with the distance squared, as required by the $x^{-2}$ variation. Also, the field was found to generally increase with the height z of the observation point R, which is consistent with measured data.


B.  The  Field  Above  the  Tree  Tops  (Region  II) 

For observation points R above the tree tops ($x \le b$, z > h), the extended slab geometry used above also holds because the forest boundary at x = b has a negligible effect on the wave reaching the receiving dipole. In this case, a first-order saddle-point evaluation for large distances $r \gg h$ yields a geometric-optical result $E_{II}^{(0)}$:

[Eq. (6) is garbled in the source. Legible factors: the coefficient $120\pi i\,Ia\,p$; the phase factor $\exp\{ik[r + \sqrt{n^2-\sin^2\theta}\,(h-z_0)]\}$; and the denominators $m\cos\theta + \sqrt{n^2-\sin^2\theta}$, $1 - \nu(\theta)B(\theta,h)$ and $\lambda r$.]   (6)

where r and $\theta$ are indicated in Fig. 3, and

$$\nu(\theta) = \frac{\sqrt{n^2-\sin^2\theta} - m\cos\theta}{\sqrt{n^2-\sin^2\theta} + m\cos\theta}, \qquad (7)$$

p = 1 for horizontal polarization; for vertical polarization p is proportional to $\sin\theta$ (coefficient illegible in the source).   (8)

Fig. 3. Ray paths TA'R for the reflected wave and TABR for the lateral wave above the tree tops (Region II). Because of far-field assumptions, the radius vector r is essentially parallel to the refracted ray A'R.


The field $E_{II}^{(0)}$ represents a refracted wave, as shown by the ray TA'R in Figure 3. The geometric-optical character of this field is evidenced by the spherical-wave dependence $r^{-1}$ and by the exponent in Eq. (6), which yields an electrical length

$$kn\,\overline{TA'} + k\,\overline{A'R} = k[n(h-z_0)\sec\theta' + r - (h-z_0)\tan\theta'\sin\theta] = k[r + \sqrt{n^2-\sin^2\theta}\,(h-z_0)],$$

where $\theta'$, shown in Fig. 3, is related to $\theta$ via Snell's condition $\sin\theta = n\sin\theta'$.
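The same kind of check applies to the refracted ray TA'R. Assuming a real index n and Snell's condition $\sin\theta = n\sin\theta'$, the sketch below evaluates both sides of the electrical-length relation quoted for the exponent of Eq. (6); names and sample values are illustrative.

```python
import math

def refracted_path_length(r, theta, d, n, k):
    """Electrical length k*n*TA' + k*A'R, with d = h - z0 and real n."""
    theta_p = math.asin(math.sin(theta) / n)        # Snell's condition
    inside = n * d / math.cos(theta_p)               # n * TA'
    outside = r - d * math.tan(theta_p) * math.sin(theta)   # A'R (far field)
    return k * (inside + outside)

def refracted_closed_form(r, theta, d, n, k):
    """Closed form k*[r + sqrt(n^2 - sin^2(theta)) * (h - z0)]."""
    return k * (r + math.sqrt(n * n - math.sin(theta) ** 2) * d)

if __name__ == "__main__":
    r, d, n, k = 2000.0, 8.0, 1.05, 2.0 * math.pi / 3.0   # sample values
    for deg in (10.0, 45.0, 80.0):
        th = math.radians(deg)
        print(deg, refracted_path_length(r, th, d, n, k),
              refracted_closed_form(r, th, d, n, k))
```

The two expressions agree because $n\sec\theta' - \tan\theta'\sin\theta = n\cos\theta' = \sqrt{n^2-\sin^2\theta}$.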

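The field $E_{II}^{(0)}$ is, however, only a partial component of the total field $E_{II}$. In particular, it is noted that $E_{II}^{(0)}$ is proportional to $\cos\theta$ and therefore vanishes at the forest-air boundary (z = h, $\theta = 90^\circ$). For points in the vicinity of $\theta = 90^\circ$, a second-order saddle-point contribution yields another field component which is given by

[Eq. (9) is garbled in the source. Legible factors: $E_{II}^{(1)}$ equals $E_{II}^{(0)}$ multiplied by a term containing $i\,m\sin\theta$, $1/(kr)$, $F(\theta,h)$ and $\tan\theta$.]   (9)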


By comparing Eqs. (1) and (9), one can verify that $E_{II}^{(1)}$ is exactly equal to the lateral wave $E_I$ at all points along the air-forest interface (z = h). Hence the field $E_{II}^{(1)}$ is identified as the continuation of the lateral wave above the tree tops, as indicated by the dashed ray BR that extends vertically upwards from the basic lateral-ray path TAB.

Because this field varies as $r^{-2}$ while $E_{II}^{(0)}$ varies as $r^{-1}$, $E_{II}^{(1)}$ is significant only for angles $\theta$ near $90^\circ$, in which case $\cot\theta \approx \cos\theta \approx H/r \ll 1$, where H = z - h. Inserting these approximations into Eq. (9), one finds that $E_{II}^{(0)}$ and $E_{II}^{(1)}$ become equal in magnitude at a height $H_c$ above the tree tops, which is given by

$$H_c = \left|\frac{m\,F(90^\circ,h)}{2\pi\sqrt{n^2-1}}\right|\lambda. \qquad (10)$$


C. The Field High Above the Bare-Ground Area (Region III)

Because the field at heights $z > h + H_c$ above the tree tops is given essentially by a refracted ray, as shown above, it follows that this ray is responsible for the field also at points of a ray path that penetrates into the region above the bare ground area. This argument is illustrated in Fig. 4, where the refracted ray TA'R is extended in Region III. Consequently, the field $E_{III}$ in this region is given simply by

$$E_{III} = E_{II}^{(0)}, \qquad (11)$$

except that now the distance r that enters into the expression for $E_{III}$ involves values such that x > b.


D. The Field at Low Heights Above the Bare-Ground Area (Region IV)

For observation points R near or below the limit ray TCL shown in Fig. 4, the field cannot be evaluated by directly extending a ray obtained from a previous result. This happens because Region IV is strictly a shadow domain with respect to geometric-optical ray contributions from the transmitter at T; hence the signal reaches into this region only by a diffraction process. However, since the forest layer acts as a "weak" dielectric (with $n \approx 1$), a rather simple argument can provide a good estimate of the field amplitude in Region IV.
Fig. 4. Extension of the refracted ray TA'R into the region high above the bare-ground area (Region III). The limit ray TCL indicates the approximate location of the boundary between Regions III and IV.

To establish this field evaluation, it is convenient to invert the problem by considering the field at point T produced by a dipole located at point R. Referring to Fig. 5, the field in the vicinity of the forest edge at B is then due essentially to a direct ray RB and a ground-reflected ray RGB. In the present case, the situation of practical interest is when both h and z are much smaller than the distance x - b between the point R and the forest edge at B. (If x - b is comparable to h and z, the field at R is well approximated by $E_I$ because then $x \approx b \gg h$.) However, for $x - b \gg h, z$, the two rays BR and BGR are near grazing incidence, i.e., $\theta_1 \approx \theta_2 \approx 90^\circ$, as shown in Figure 5.

Fig. 5. Ray construction for evaluating the field at low heights above the bare ground area (Region IV). The path of the signal from T to R is obtained by reversing the directions of the arrows.

The field at the forest edge B under grazing-incidence conditions is given by

[Eq. (12) is garbled in the source. Legible factors: the coefficient $-i\,120\pi\,Ia$ and a square bracket containing terms in $(z+h)/r_0$ and $\sin(kzh/r_0)$.]   (12)

where $r_0$ is the average of $r_1$ and $r_2$. For values $kzh/r_0 \ll 1$, the entire expression in the square brackets of Eq. (12) is well approximated by a constant. This means that $E_B$ varies as $(x-b)^{-2}\exp[ik(x-b)]$ for points in the vicinity of the forest edge at B. Such a variation is identical to that of a lateral wave that would be generated by the dipole at R if the forest layer extended to the right of B so as to cover the entire bare ground area.

Because the field at, and to the left of, the forest edge B matches the field of a lateral wave, analytical continuation arguments stipulate that the field at any point T inside the forest is found by simply extending the "lateral wave" from B to T. By noting the lateral-wave behavior expressed in Eq. (1), and using reciprocity arguments, Eq. (12) leads to

[Eq. (13) is garbled in the source. Legible factors: the coefficient $-60\pi i\,Ia$; the ratio $[1 + B(90^\circ,z_0)]/[1 + B(90^\circ,h)]$; the phase factor $\exp\{ik[x + \sqrt{n^2-1}\,(h-z_0)]\}$; and two terms carrying $e^{-ikzh/r_0}$ and $e^{ikzh/r_0}$ together with $\Gamma(\theta_1)$ and $\Gamma(\theta_2)$.]   (13)


The argument leading to $E_{IV}$ in Eq. (13), as illustrated in Fig. 5, indicates that the mechanism of propagation for Region IV is more complex than that in the other regions. Specifically, the signal now arrives at R by first starting as a lateral wave that follows the forest contour up to the edge B. After that, the signal continues as a combination of two geometric optical rays and thus reaches the observation point at R via a different wave form than it had initially.

It is pertinent to recognize that the four regimes I-IV discussed here play the role of canonical forms, or building blocks, which can be applied to construct the field in more complicated situations. As an example, Fig. 6 shows two separate forest layers of different heights, with a bare-ground region in between. For an antenna located, say, at $T(0, z_0)$, the field at any observation point R(x, z) may be found by using the arguments and results presented here because the field above the bare-ground region is only slightly affected by the presence of the second forest (extending along $x > b_2$). Hence the field is then given by either $E_{III}$ or $E_{IV}$, depending on the height z. However, inside the second forest (i.e., for $x > b_2$ and $z < h_2$) the field may be obtained by combining three ray constructions in tandem along the lines suggested by Fig. 5, except that now point $B_2$ plays the role of point R in Figure 5.

Consequently, the field in Fig. 6 is found by starting with a lateral wave from T to $B_1$, followed by direct and reflected rays connecting $B_1$ to $B_2$, after which $B_2$ is connected to R by another lateral wave. For composite configurations involving more than two forest regions, the field can be similarly tracked from point to point by using the canonical ray construction discussed here. More details on these and other related questions are given in a recent publication.

Fig. 6. Ray construction for evaluating the field in a situation containing two forest regions separated by a large clearing.

U.S.  Army  Electronics  Command  T.  Tamir 

Army  Land  Warfare  Laboratory 
Aberdeen  Proving  Ground,  Maryland 
DIELECTRIC GRATING ANTENNAS FOR MILLIMETER WAVES
S. T. Peng and A. A. Oliner

Recent advances in the fabrication of millimeter (mm)-wave systems using integrated circuit (IC) technology have stimulated considerable interest in the development of new components and devices for signal generation, transmission and processing involving mm-waves. To be compatible with the new technology, it is very desirable to have an antenna structure that may be integrated with other components, so that the cost, size and weight of a mm-wave system can be greatly reduced. This study examines the electromagnetic characteristics of an antenna structure that is compatible with the IC technology for mm-wave applications.

The antenna structure investigated here is a dielectric grating in conjunction with an open dielectric waveguide, which belongs to the general class of traveling wave antennas. Such a composite waveguide structure falls into the general class of periodic dielectric waveguides because it may also serve, with proper designs, many other functions, such as filtering and distributed feedback of electromagnetic waves. As a rigorous electromagnetic boundary value problem, the periodic dielectric waveguide structure has been extensively analyzed in the past, mostly in the context of integrated optics applications (Refs. 1-4). The purpose of this work, however, is to investigate the periodic dielectric waveguide with physical and structural parameters that are pertinent to applications in the mm-wave frequency range and with particular attention directed toward the radiation characteristics of such a structure as an antenna.

The requirements on grating antennas for mm-wave applications differ from those for optical couplers in many respects. Because of the longer wavelength, the limitation on the size of grating antennas is very severe. For example, at the frequency of 100 GHz, an antenna 3 inches in length permits only about 25 wavelengths in free space. To radiate almost all the incoming energy within 25 wavelengths (in contrast to more than 1000 wavelengths for optical couplers) the periodic waveguide requires a very large leakage constant. Fortunately, the high dielectric constants of mm-wave materials are very helpful in achieving a large leakage constant, but they may produce numerical difficulties in an exact analysis of the antenna. Also, the previously developed perturbation analysis that was based on the small difference between the dielectric constants of the grating and the surrounding medium does not yield sufficiently accurate results, and a new approximation technique has to be introduced to obtain better analytic results for dielectric grating antennas of high dielectric constant.
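The size constraint quoted above is easy to reproduce. The sketch below converts a physical antenna length to free-space wavelengths and estimates the normalized leakage constant needed to radiate a given fraction of the guided power over that length. It assumes that the field amplitude decays as $e^{-\alpha x}$, so the guided power decays as $e^{-2\alpha x}$, and that all leaked power is radiated; the function names and the 90% target are illustrative choices, not values from the report.

```python
import math

C = 299_792_458.0   # speed of light in vacuum, m/s
INCH = 0.0254       # metres per inch

def length_in_wavelengths(length_m, freq_hz):
    """Antenna length expressed in free-space wavelengths."""
    return length_m * freq_hz / C

def required_alpha_lambda(fraction, length_wavelengths):
    """Normalized leakage constant alpha*lambda that radiates `fraction`
    of the guided power within the given length, assuming the power
    decays as exp(-2*alpha*x)."""
    return -math.log(1.0 - fraction) / (2.0 * length_wavelengths)

if __name__ == "__main__":
    n_wl = length_in_wavelengths(3 * INCH, 100e9)
    print(f"3 inches at 100 GHz = {n_wl:.1f} wavelengths")
    print(f"alpha*lambda for 90% radiated: {required_alpha_lambda(0.9, n_wl):.3f}")
```

A 3 inch aperture at 100 GHz comes out at roughly 25 wavelengths, in agreement with the figure quoted in the text.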

The configuration of a dielectric grating antenna structure is shown in Figure 1. Such an electromagnetic structure has been previously analyzed in the context of optical periodic couplers, and many of the radiation characteristics of the structure are known. For example, it was found that the radiation is strongest when the groove width is nearly equal to the tooth width. For this reason, throughout this work we shall choose the width of the grooves to be equal to that of the teeth.

Fig. 1. Configuration of dielectric grating antennas suitable for millimeter wave applications.


For a given dielectric material, the antenna structure is characterized by three parameters: the thickness of the uniform film, $t_f$, the thickness of the grating region, $t_g$, and the period of the grating, d. In addition, the frequency of the source is characterized by the wavelength, $\lambda$. We have determined the effect of each of the four parameters on the radiation characteristics of the antenna with three different dielectric materials: silicon ($\epsilon_r = 12$), aluminum oxide ($\epsilon_r = 10$) and boron nitride ($\epsilon_r = 4$).

For the three materials under consideration, we have determined that $t_f < 0.25\lambda$ will ensure single mode operation. The radiation rates are found to be sufficiently large over a wide range of radiating angles (from -70° to 70°) for silicon and aluminum oxide antennas, but not for boron nitride antennas. As an example, the leakage constant of a silicon grating antenna is shown in Fig. 2 for the range of periodicity lengths d such that only a single beam radiates. The angle of radiation of the beam is directly related to the value of $d/\lambda$; on Fig. 2 some selected angles are also indicated. For instance, at $d/\lambda = 0.36$, Fig. 2 shows that $\alpha\lambda \approx 0.1$ and the beam is radiating in the forward direction at about 33 degrees from the normal. For an antenna length $\ell = 20\lambda$, the field strength at the exit end of the antenna will be smaller than that at the entrance end by a factor $e^{-\alpha\ell} = e^{-2} \approx 0.1$. This means that most of the surface-wave energy can be radiated within 20 wavelengths. The leakage constant $\alpha$ can be easily adjusted (within a maximum limit) by changing the grating thickness. A detailed description of the general procedure for the design of grating antennas is in preparation and will be issued as a final report to the ECOM, Ft. Monmouth, N. J., in the near future.
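The numerical example in this paragraph can be reproduced in a few lines. The sketch below computes the exit-to-entrance field ratio for a given normalized leakage constant and antenna length, together with the corresponding fraction of guided power that has leaked away. The power estimate assumes that power scales as the square of the field amplitude and that material losses are negligible; these assumptions, and the function names, are not taken from the report.

```python
import math

def field_ratio(alpha_lambda, length_wavelengths):
    """Exit-to-entrance field amplitude ratio exp(-alpha * length)."""
    return math.exp(-alpha_lambda * length_wavelengths)

def leaked_power_fraction(alpha_lambda, length_wavelengths):
    """Fraction of guided power removed over the antenna length,
    assuming power is proportional to the squared field amplitude."""
    return 1.0 - field_ratio(alpha_lambda, length_wavelengths) ** 2

if __name__ == "__main__":
    ratio = field_ratio(0.1, 20.0)
    leaked = leaked_power_fraction(0.1, 20.0)
    print(f"field ratio = {ratio:.3f}, leaked power fraction = {leaked:.3f}")
```

With $\alpha\lambda = 0.1$ and a length of 20 wavelengths the field ratio is $e^{-2}$, and under the stated assumptions nearly all of the guided power has leaked out by the end of the antenna.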
Fig. 2. Leakage constant of silicon dielectric grating antenna vs. the periodicity length. $\epsilon_r = 12$, $t_f = 0.2\lambda$ and $t_g = 0.05\lambda$.
NOVEL TOTAL-ABSORPTION PHENOMENON OF ENERGY INCIDENT ON PLANAR STRUCTURES

V.  Shah  and  T.  Tamir 

We show that the energy of a plane wave incident on a layered structure can be totally absorbed if one of the layers has either dielectric or metallic losses. By interpreting this phenomenon in terms of a leaky-wave interaction, we explain similar total absorption effects that have been reported in the context of metallic gratings.

A.  Introduction 

Recent work on metallic gratings has shown that the energy of an incident plane wave can be totally absorbed by the ohmic losses of the metal. In such cases, the reflected and diffracted fields are suppressed, thus obtaining a Brewster-type phenomenon. This phenomenon was first studied (Refs. 1, 2) for waves with parallel polarization (TM modes) incident on a metal grating, and the strong absorption was associated with the excitation of surface plasmons. However, it was later shown that the same behavior can be exhibited by waves with perpendicular polarization (TE modes) incident on a grating covered by a dielectric layer; in this case, the strong absorption was related to the excitation of a surface wave supported by the layer.

While the above deals with optical phenomena which can be regarded in electromagnetic terms, the complete suppression of reflected energy by losses has been known for some time in acoustics. In particular, a plane wave incident from a liquid onto a solid can be totally absorbed if the solid medium is slightly lossy (Ref. 7). This phenomenon was shown to be produced by the presence of a leaky Rayleigh wave, which also accounts for the lateral shift (Schoch displacement) of bounded acoustic beams.

By using a non-periodic electromagnetic model, which is analogous to the case of acoustic waves in Ref. 7, we show here that total absorption of an incident plane wave can also occur in optics. We then demonstrate that the strong absorption phenomenon is due to a leaky (surface) wave interaction in the presence of either metallic or dielectric losses. We therefore find that this phenomenon is not necessarily restricted to metallic gratings. Instead, total absorption at certain critical angles is found to occur in a wider class of configurations, of which metallic gratings are only a particular case.

B.  Plane  Wave  Reflection  by  a Layered  Lossy  Structure 

Consider the layered configuration shown in Fig. 1, where a plane wave having perpendicular polarization is assumed to be incident at an angle $\theta$ from the upper medium. A total of four media are involved, which are characterized by the dielectric constants $\epsilon_d$, $\epsilon_a$, $\epsilon_f$ and $\epsilon_m$. Of these, $\epsilon_d$ and $\epsilon_a$ are assumed to be real and thus designate lossless media. On the other hand, either $\epsilon_f$ or $\epsilon_m$ may be taken as complex so as to represent lossy media. The former will then refer to a layer of thickness $t_f$ having dielectric losses, whereas the latter will represent a metal substrate having conduction losses.

Fig. 1. Configuration of the layered structure.

For the situation shown and assuming a time variation $\exp(-i\omega t)$, the reflectivity at $z = 0$ for the incident wave can be expressed in the form


where

$$A = A(k_x) = \tan(k_{za} t_a), \tag{2}$$

$$F = F(k_x) = \tan(k_{zf} t_f), \tag{3}$$

$$k_{zj} = \left[(2\pi/\lambda)^2 \epsilon_j - k_x^2\right]^{1/2} \tag{4}$$

for $j = d$, $a$, $f$ or $m$.
Here $\lambda$ is the wavelength in vacuum, $k_{zj}$ is the transverse wavenumber in the $j$-th medium, while $k_x$ is the longitudinal wavenumber, which is related to the incidence angle $\theta$ by

$$k_x = (2\pi/\lambda)\sqrt{\epsilon_d}\,\sin\theta. \tag{5}$$

For a lossless configuration, the metal substrate implies that the imaginary part of $\epsilon_m$ becomes infinite while all other dielectric constants are finite and real. In that case, $k_{zm}$ becomes infinite and Eq. (1) simplifies to

$$\rho(k_x) = \frac{(A k_{zf} + F k_{za})\,k_{zd} + i\,(A F k_{za} - k_{zf})\,k_{za}}{(A k_{zf} + F k_{za})\,k_{zd} - i\,(A F k_{za} - k_{zf})\,k_{za}}. \tag{6}$$


While $k_x$ is a real variable for incident plane waves, as in Eq. (5), the function $\rho(k_x)$ in Eq. (6) can be extended into the entire complex $k_x$ plane. Thus, $\rho(k_x)$ has zeros and poles in this complex $k_x$ plane, which we denote by $k_{xo}$ and $k_{xp}$, respectively. The latter are associated with surface waves ($k_{xp}$ real) and leaky waves ($k_{xp}$ complex) that can be supported by the given configuration (see Reference 8).
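As a numerical illustration of Eqs. (2) to (6), the following sketch evaluates the reflectivity of the lossless structure on a perfect conductor and confirms that its magnitude is unity for real incidence angles. The function name is hypothetical, the default parameters are those quoted in the caption of Fig. 2 of this section ($\epsilon_d = 4$, $\epsilon_a = 1$, $\epsilon_f = 3$, $t_a = 0.33\lambda$, $t_f = 0.375\lambda$), and lengths are expressed in units of the vacuum wavelength.

```python
import cmath
import math

def reflectivity(theta_deg, eps_d=4.0, eps_a=1.0, eps_f=3.0, t_a=0.33, t_f=0.375):
    """Reflectivity of Eq. (6) for a perfectly conducting substrate.

    Thicknesses are in units of the vacuum wavelength, so 2*pi/lambda = 2*pi.
    """
    k0 = 2.0 * math.pi
    kx = k0 * math.sqrt(eps_d) * math.sin(math.radians(theta_deg))  # Eq. (5)

    def kz(eps):
        # Eq. (4); the complex square root covers evanescent media as well
        return cmath.sqrt(k0 ** 2 * eps - kx ** 2)

    kzd, kza, kzf = kz(eps_d), kz(eps_a), kz(eps_f)
    A = cmath.tan(kza * t_a)  # Eq. (2)
    F = cmath.tan(kzf * t_f)  # Eq. (3)
    p = (A * kzf + F * kza) * kzd
    q = (A * F * kza - kzf) * kza
    return (p + 1j * q) / (p - 1j * q)

for theta in (10.0, 25.0, 40.0, 45.0, 55.0, 70.0):
    print(theta, abs(reflectivity(theta)))
```

Because Eq. (6) is unchanged when the sign of $k_{za}$ or $k_{zf}$ is reversed, the branch chosen by the complex square root in those two media does not affect the result.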


In the present study, our concern is with the zeros of Eqs. (1) and (6); in particular, we show that $\rho(k_x)$ in Eq. (1) may have a zero for real values of $k_x$ and $\theta$, i.e., a Brewster-type behavior occurs if the parameters are properly chosen. For this purpose, we recall that the configuration in Fig. 1 can support a leaky wave having small attenuation along $x$ if the dielectric constants satisfy $\epsilon_d > \epsilon_f > \epsilon_a$. This is the familiar situation of prism couplers (Refs. 8, 9), except that now we have introduced a metal substrate to show the effect of conduction losses. In such cases, the significant leaky-wave pole occurs at

$$k_{xp} = \beta + i\alpha, \tag{7}$$

where $\alpha \ll \beta$. A strong wave interaction is then known to occur if a beam is incident at an angle $\theta$ so as to satisfy the phase-match condition $\beta = k_x$, where $k_x$ is given by Eq. (5).

For lossless structures, the form of Eq. (6) suggests that, for values of $k_x$ close to $\beta$, $\rho(k_x)$ can be approximated by (Ref. 7)

$$\rho(k_x) = r(k_x)\,\frac{k_x - k_{xo}}{k_x - k_{xp}}, \tag{8}$$

where $r(k_x)$ is slowly varying. In particular, for sufficiently small leakage, i.e., $\alpha \ll \beta$, as assumed here, $r(k_x) = R$ is constant for $k_x$ close to $\beta$. Furthermore, for lossless cases, it is obvious that the magnitude of $\rho(k_x)$ must be unity for plane-wave incidence because the metal substrate is perfectly reflecting and no absorption is assumed to occur anywhere. It is then easy to verify that $|R| = 1$ and $k_{xo} = k_{xp}^*$, where the asterisk in $k_{xp}^*$ denotes the complex conjugate value.


If small losses are introduced, the locations of both the pole ($k_{xp}$) and the zero ($k_{xo}$) will change, so that we may write Eq. (8) in the form

$$|\rho(k_x)| = \left|\frac{k_x - k_{xo}(\delta)}{k_x - k_{xp}(\delta)}\right|, \tag{9}$$

where $\delta$ denotes a small-loss variable, which will be defined later. However, it is important to note that $k_{xo}(0) = k_{xp}^*(0)$, in agreement with the previous discussion. Also, by using calculations based on the exact expression, Eq. (1), we have found that the approximations involved in Eqs. (8) and (9) are extremely accurate for the small losses that occur in actual practice. This is in agreement with previous results obtained in acoustics (Ref. 7), as well as with a similar approximation that was used in the context of metallic gratings.
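The pole-zero form of Eq. (9) is easy to explore numerically. The sketch below is a simplified model rather than the perturbation solution used in the report: it assumes, for illustration only, that losses raise the pole and the zero by the same amount `alpha_loss` in the complex $k_x$ plane, starting from the conjugate lossless positions $\beta \pm i\alpha$. The function and parameter names are hypothetical.

```python
def reflectivity_magnitude(kx, beta, alpha_leak, alpha_loss):
    """|rho| from the pole-zero form of Eq. (9).

    Assumption (not taken from the report): losses shift both the pole and the
    zero upward by the same amount alpha_loss, so that
        k_xp = beta + i (alpha_leak + alpha_loss)
        k_xo = beta + i (alpha_loss - alpha_leak).
    """
    k_pole = complex(beta, alpha_leak + alpha_loss)
    k_zero = complex(beta, alpha_loss - alpha_leak)
    return abs((kx - k_zero) / (kx - k_pole))

beta, alpha = 1.4, 0.01  # illustrative values in units of 2*pi/lambda

# Lossless case: pole and zero are complex conjugates, so |rho| = 1 for real kx.
print(reflectivity_magnitude(beta + 0.005, beta, alpha, 0.0))

# Critical loss: the zero lies on the real axis and |rho| vanishes at kx = beta.
print(reflectivity_magnitude(beta, beta, alpha, alpha))

# Away from phase match the reflectivity recovers toward unity.
print(reflectivity_magnitude(beta + 20 * alpha, beta, alpha, alpha))
```

Under this assumption the reflectivity at phase match is $|\alpha_{loss} - \alpha_{leak}| / (\alpha_{loss} + \alpha_{leak})$, so total absorption requires the loss-induced shift to equal the leakage constant.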

Consider now the variation in the complex $k_x$ plane of both $k_{xo}(\delta)$ and $k_{xp}(\delta)$ as a function of $\delta$, as shown for a typical case in Fig. 2. In this case, it was assumed that the substrate is perfectly conducting and losses occur only in the layer of thickness $t_f$, which now has a complex dielectric constant given by

$$\epsilon_f = \epsilon_f'\,(1 + i\tan\delta_f). \tag{10}$$

Here $\epsilon_f'$ is real and $\tan\delta_f$ is the dielectric loss factor. The variation of $k_{xo}(\delta_f)$ and $k_{xp}(\delta_f)$ can be easily found by using a perturbation solution of Eq. (1), which was verified to be very accurate by means of computer calculations. However, these details are omitted here. In the lossless limit ($\tan\delta_f = 0$), the pole and zero are in conjugate locations, but they both move upward in the complex $k_x$ plane as $\delta_f$ increases. In particular, at some critical value $\delta_{cf}$ of $\delta_f$, the locus of $k_{xo}(\delta_f)$ intersects the real $k_x$ axis at $\beta_{cf}$. This implies that an incidence angle $\theta_{cf}$ exists which satisfies Eq. (5), so that

$$\beta_{cf} = (2\pi/\lambda)\sqrt{\epsilon_d}\,\sin\theta_{cf}. \tag{11}$$

It therefore follows that $\rho(k_x) = \rho(\beta_{cf})$ vanishes if $\delta_f = \delta_{cf}$ and $\theta = \theta_{cf}$, i.e., reflection is entirely suppressed.

Because of the leaky-wave nature of the situation, the total absorption phenomenon can be attributed entirely to the strong interaction that occurs when the incident wave is phase-matched to the leakage angle. This is therefore the same situation that may occur at the interface between a liquid and a solid in acoustics (Ref. 7), or in the prism-coupler configuration in optics (Ref. 9). As the angle $\theta$ is scanned about $\theta_{cf}$, the reflectivity varies as shown in Fig. 3, which is quite similar to Fig. 13 of Reference 7. It is noted that the half-power points for the reflected field determine a very narrow range of strong absorption, so that $|\rho(k_x)|$ recovers quickly to unity not far from $\theta_{cf}$.

Fig. 2. Location of the leaky-wave pole and its associated zero in the complex $k_x$ plane, as a function of $\delta_f$, for $\epsilon_d = 4$, $\epsilon_a = 1$, $\epsilon_f = 3(1 + i\tan\delta_f)$, $t_a = 0.33\lambda$, and $t_f = 0.375\lambda$. The metal substrate is assumed to be a perfect conductor.
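The narrow absorption dip can be reproduced qualitatively by scanning the incidence angle with a lossy layer on a perfect conductor. The sketch below reuses the perfect-conductor expression, Eq. (6), with the complex dielectric constant of Eq. (10) and the parameters of Fig. 2. The loss factor `tan_delta_f = 0.01` is an arbitrary illustrative value, not the critical value of the report, and the function name is hypothetical.

```python
import cmath
import math

def reflectivity(theta_deg, tan_delta_f=0.0, eps_d=4.0, eps_a=1.0,
                 eps_f_real=3.0, t_a=0.33, t_f=0.375):
    """Eq. (6) with eps_f = eps_f_real * (1 + i tan(delta_f)), as in Eq. (10).

    Lengths are in units of the vacuum wavelength; the substrate is a perfect
    conductor and the time dependence is exp(-i omega t).
    """
    k0 = 2.0 * math.pi
    eps_f = eps_f_real * (1.0 + 1j * tan_delta_f)
    kx = k0 * math.sqrt(eps_d) * math.sin(math.radians(theta_deg))  # Eq. (5)
    kzd = cmath.sqrt(k0 ** 2 * eps_d - kx ** 2)                      # Eq. (4)
    kza = cmath.sqrt(k0 ** 2 * eps_a - kx ** 2)
    kzf = cmath.sqrt(k0 ** 2 * eps_f - kx ** 2)
    A = cmath.tan(kza * t_a)
    F = cmath.tan(kzf * t_f)
    p = (A * kzf + F * kza) * kzd
    q = (A * F * kza - kzf) * kza
    return (p + 1j * q) / (p - 1j * q)

# Scan 30.5 to 59.5 degrees, where the wave is evanescent in layer a
# (eps_a = 1) and propagating in layer f (eps_f = 3).
angles = [30.5 + 0.02 * i for i in range(1451)]
mags = [abs(reflectivity(th, tan_delta_f=0.01)) for th in angles]
i_min = min(range(len(mags)), key=mags.__getitem__)
print(f"minimum |rho| = {mags[i_min]:.3f} at theta = {angles[i_min]:.2f} degrees")
```

Repeating the scan for several values of `tan_delta_f` and watching the depth of the minimum is one way to locate the critical loss numerically.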

If losses occur in the metal substrate rather than in the dielectric, we take

$$\epsilon_m = 1 + i\cot\delta_m, \tag{12}$$

where now $\tan\delta_m = \omega\epsilon_0/\sigma$ represents the conduction loss factor. By using the same analytic approach as in the case of dielectric losses, we find that the loci of the pole and zero are now given by Fig. 4. Again, the locus of $k_{xo}$ intersects the real $k_x$ axis at $\beta_{cm}$ for a critical value $\delta_{cm}$ of $\delta_m$. As in the case of dielectric losses, Fig. 5 now shows that the variation of the reflectivity $\rho(k_x)$ around the corresponding angle $\theta_{cm}$ is such that $\rho(k_x)$ vanishes at $\theta_{cm}$ but recovers to unity not far from that critical angle.

Fig. 3. Variation of the reflectivity $|\rho(k_x)|$ as a function of the incidence angle $\theta$ for the same parameters as in Fig. 2.

Fig. 4. Location of the leaky-wave pole and its associated zero in the complex $k_x$ plane, as a function of $\delta_m$. The parameters are the same as in Fig. 2, except that now $\epsilon_f = 3$ and the metal substrate is lossy, with $\epsilon_m = 1 + i\cot\delta_m$.

Fig. 5. Variation of the reflectivity $|\rho(k_x)|$ as a function of the incidence angle $\theta$ for the same parameters as in Fig. 4.

C.  Discussion 

The behavior displayed in Figs. 2 to 5 is typical of the situation encountered when a plane wave is incident on a leaky-wave structure satisfying the small leakage condition $\alpha \ll \beta$. Thus, this behavior holds for TM modes as well as for the TE modes considered here; by extension, $\rho(k_x)$ can vanish for parameters $\epsilon_d$, $\epsilon_a$, $\epsilon_f$, $t_a$ and $t_f$ that may be different from those chosen here. Also, more than four media may be involved and losses may occur in more than one of them. However, in all cases, the complete suppression of the reflected wave would occur only if both conditions $\delta = \delta_c$ and $\theta = \theta_c$ are satisfied simultaneously.

On the other hand, the leaky-wave configuration need not be restricted to layered media. In fact, the metallic gratings already studied in this context are but particular cases of leaky-wave structures. Specifically, the conduction losses of metallic gratings are needed to provide the variation discussed after Eq. (12), while the presence of grooves and their periodicity is needed to provide a diffraction order to which an incident wave can be phase-matched. Phrased differently, these metallic gratings are structures that support waves of the leaky variety. As such, their total absorption properties follow the same criteria that have been presented here, provided only that they can support a leaky wave that satisfies the slow leakage condition $\alpha \ll \beta$.

Of course, the mathematical analysis of periodic structures, in general, and of metallic gratings, in particular, is considerably more complicated than the exposition given here. However, the basic leaky-wave feature, which is common to both layered media and periodic structures, has already been studied and demonstrated by Hessel and Oliner in the context of Wood anomalies, and by Tamir and Bertoni in the context of bounded-beam displacements. The reader is therefore referred to their work for further details and to Ref. 7 for a discussion of total absorption in the analogous acoustic case.

D.  Conclusion 

We  have  found  that  a simple  layered  configuration  can  exhibit  a Brewster-type 
phenomenon  whereby  the  energy  of  an  incident  plane  wave  is  totally  absorbed  by  the 
ohmic  losses  in  the  media,  so  that  the  reflected  field  is  entirely  suppressed.  Viewed 
as  a model  for  a broader  class  of  leaky-wave  structures,  this  configuration  explains 
the  suppression  of  fields  reflected  (or  diffracted)  by  metallic  gratings  with  conduction 
losses.  The  mechanism  for  this  effect  can  then  be  explained  in  terms  of  the  strong 
interaction  that  occurs  when  the  incident  wave  is  phase-matched  to  the  leaky-wave  field 
that  may  be  supported  by  either  layered  media  or  by  periodic  structures. 

Consistent  with  the  leaky-wave  interpretation  presented  here,  we  find  that  total 
absorption  of  the  incident  energy  may  be  due  to  either  metallic  or  dielectric  losses  and 
it may occur in both TE and TM modes. Furthermore, we note that the leaky-wave behavior discussed in electromagnetic terms here not only covers effects that have been observed along optical gratings, but is also analogous to phenomena that have been observed and studied in acoustics.
SHEAR WAVE PROPAGATION IN SYMMETRIC HEXAGONAL HONEYCOMB CORES
H.  L.  Bertoni  and  S.  K.  Park 

Honeycomb  panels,  consisting  of  a hexagonal  core  structure  bonded  to  metal 
face  plates,  are  strong  lightweight  construction  elements.  These  elements  are  used 
in  fabricating  containers  for  housing  electronic  equipment,  in  airplane  construction, 
and  other  applications. 

The  core  consists  of  an  array  of  thin-walled,  hollow  cylinders  of  hexagonal  cross 
section,  as  shown  in  Fig.  1,  and  resembles  a honeycomb.  The  core  material  consists 
of  waxed  cardboard  in  the  case  of  equipment  containers,  while  in  airplane  construction 
it  is  usually  metal.  Face  plates  are  attached  to  both  ends  of  the  core  using  epoxy  or 
other  adhesive  bonding  materials  (one  of  the  face  plates  is  shown  in  Figure  1). 




Fig.  1.  Honeycomb  core  attached  to  one  face  plate. 

During fabrication and use, defects can occur within the panel, and are therefore not detectable by visual inspection. One type of defect is a region of a face plate that is not bonded to the core. Another defect is the presence of water in some of the cells, which can degrade a cardboard core and weaken the epoxy or adhesive bond between core and face plate. Other defects are buckled or fractured cell walls.

Currently, acoustic non-destructive evaluation (NDE) techniques are used to locate the defects cited above. An acoustic beam in a fluid is directed at normal incidence on the panel and received at the opposite face. The beam is then scanned in a raster fashion over the area of the panel. Unbonded regions interrupt the transmission, while water in cells increases the transmission. Since the size of the acoustic beam is on the order of the cell diameter, the raster scan inspection procedure is very lengthy for a large panel. A great reduction in inspection time could be achieved by propagating the acoustic waves parallel to the face plates, rather than perpendicular to them. In this way, one would inspect an entire strip along the panel at one time. Reinspecting installed panels with only one face exposed would require propagation along the panel.

Low frequency studies of propagation along honeycomb panels have been carried out with the aim of NDE inspection in mind. However, it was recognized that successful NDE inspection would require the use of high frequencies for which the acoustic wavelengths would be on the order of the cell diameter. Establishment of a sensitive test procedure requires a knowledge of the propagation characteristics of the modes guided by honeycomb panels. From such knowledge, one can determine the frequency and mode polarization most useful for inspection. These choices must be made in relation to the geometric design and elastic properties of materials used in the panel.

As  a first  step  in  the  analysis  of  the  modes  guided  by  honeycomb  panels,  we 
have  computed  the  properties  of  waves  propagating  transverse  to  the  cell  axis  in  a 
honeycomb  of  infinite  extent.  The  wave  polarization  is  chosen  such  that  the  particle 
motion  is  parallel  to  the  axis  of  the  comb  cells.  For  this  polarization  and  direction 
of  propagation  the  waves  in  the  cell  walls  are  shear  waves,  which  do  not  excite  the 
Lamb  or  flexural  waves  at  the  Y joints  of  the  cell  walls,  as  discussed  below.  This 
polarization  is,  as  a result,  the  simplest  to  analyze. 

A.  Equivalent  Network  Description  of  the  Honeycomb 

The cross-section of a regular hexagonal honeycomb is shown in Fig. 2. The core consists of strips of thickness $t$, width $w$ and infinite length along $z$ (out of the plane of the paper). The strips are arranged in a regular hexagonal pattern. For realistic honeycombs, $t \ll w$, so that the strips are thin compared to the wavelength, even for frequencies where the periodic nature of the core is significant. With the assumption that $t$ is small compared to the wavelength, the elastic fields in the strips are those associated with the three lowest modes of a plate, namely the Lamb mode (L), the S-H plate mode (S) and the flexural mode (F). In addition, near the joints between the strips the next higher flexural mode $F_2$, which is evanescent away from the junction, will be significant. For small $t$, the other evanescent plate modes will not be strongly excited. Finally, thin plate or plane strain approximations can be used for the modes.

A unit cell of the periodic core is indicated by the dashed curve in Fig. 2. The entire periodic core can be generated by translations of the unit cell along the basis vectors $\mathbf{a}_1$ and $\mathbf{a}_2$, shown in Fig. 2, of the periodic structure. The cell is seen to be composed of plate segments coupled at two Y joints. A primary part of the study of core waves therefore involves the coupling of the plate waves at the Y joint.

In order to satisfy the boundary conditions at the Y joint, it is necessary that all plate waves have the same phase dependence $\exp(-jk_z z)$ along $z$. The variation of the plate waves transverse to $z$ can be represented by equivalent transmission lines (Refs. 5-9). For propagation in the x-y plane, $k_z = 0$. In this case an S-H mode of the plate, whose particle motion is along $z$, will not couple to the Lamb or flexural modes at the Y joint, since their particle motion is perpendicular to $z$.

Fig. 2. Cross-section of regular hexagonal honeycomb core. Unit cell is indicated by dashed ellipse.

For the lowest S-H mode of a plate, the particle velocity and stress have no variation through the thickness of the plate. For propagation perpendicular to $z$, the particle velocity has only the component $v_z$ along $z$, whereas the force or stress across a surface perpendicular to the direction of propagation has only the component $T$ along $z$. Propagation of this S-H mode can be modeled by a transmission line whose voltage $V$ and current $I$ represent the particle velocity $v_z$ and stress $T$ via the relations


Here $\rho$ is the mass density and $\mu$ is the shear modulus. The impedance of the transmission line is


and its propagation wavenumber is

$$k_s = \omega\sqrt{\rho/\mu}. \tag{3}$$

In solving for the coupling properties of the Y junction, all of the evanescent S-H waves are neglected. This approximation is consistent with the thin plate, or plane strain, approximation. By neglecting these higher modes we introduce a small phase error in the waves excited at the junction. This phase error will cause a slight shift in the stop-band frequencies of the periodic structure, which for NDE applications is not significant.

We  next  assume  that  the  triangular  cylinder  coupling  the  plates  at  the  Y junction 
(see  Fig.  3)  behaves  with  the  following  properties: 

(1)  the  cross-section  remains  undistorted; 

(2)  inertial  forces  on  the  cylinder  are  negligible. 

The approximations involved in these assumptions are justified by the fact that the width of each side of the triangular cylinder is equal to the plate thickness $t$. In the thin plate approximation, the plate thickness does not change, so that the triangle cannot distort. Also, the inertia of the mass in the triangular cylinder is less than one half that in a plate segment of width equal to the plate thickness.


With the foregoing assumptions, the dynamic conditions at the faces of the triangular cylinder are:

(1)  the  forces  on  the  three  sides  of  the  triangle  must  sum  to  zero; 

(2)  the  particle  velocities  at  the  center  of  each  side  of  the  triangle  must  be 
equal. 

Since the force $T$ is measured by current, and the particle velocity $v$ by voltage (Refs. 8, 9), condition (1) above requires that the sum of the currents be zero, while condition (2) requires that the voltages be equal. The Y junction is then represented by the star connection of transmission lines shown in Fig. 4.

Fig. 4. Network for Y junction of three plates representing S modes coupling at normal incidence ($k_z = 0$).

Knowing the network representation of each Y junction, the equivalent network for a unit cell can be obtained. This network is drawn in Fig. 5 and is seen to consist of four ports at which the voltages and currents $V_i$, $I_i$ ($i = 1$ to $4$) are defined. The four voltages and currents can be related via an impedance matrix. Because of the symmetry of the network, this relationship takes the form

$$\begin{bmatrix} V_1 \\ V_2 \\ V_3 \\ V_4 \end{bmatrix} = \begin{bmatrix} Z_{11} & Z_{12} & Z_{13} & Z_{13} \\ Z_{12} & Z_{11} & Z_{13} & Z_{13} \\ Z_{13} & Z_{13} & Z_{11} & Z_{12} \\ Z_{13} & Z_{13} & Z_{12} & Z_{11} \end{bmatrix} \begin{bmatrix} I_1 \\ I_2 \\ I_3 \\ I_4 \end{bmatrix}, \tag{4}$$

where $Z_{11}$, $Z_{12}$ and $Z_{13}$ are given by

$$Z_{11} = jZ_s\,\frac{1 - 10\tan^2\theta + 7\tan^4\theta}{6\tan\theta\,(2\tan^2\theta - 1)}, \tag{5}$$

$$Z_{12} = jZ_s\,\frac{1 - 5\tan^2\theta}{6\cos^2\theta\,\tan\theta\,(2\tan^2\theta - 1)}, \tag{6}$$

$$Z_{13} = jZ_s\,\frac{1}{6\cos^4\theta\,\tan\theta\,(2\tan^2\theta - 1)}, \tag{7}$$

with

$$\theta = k_s w/2. \tag{8}$$
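The closed form for $Z_{11}$ can be cross-checked against a direct transmission-line calculation. The sketch below assumes a particular topology for the unit-cell network of Fig. 5, which is not reproduced here: two Y junctions joined by a line of electrical length $2\theta$, with each of the four ports reached through a line of electrical length $\theta$, all lines having the same impedance $Z_s$. Function names are hypothetical and an $\exp(j\omega t)$ time dependence is used.

```python
import math

def z11_closed_form(theta: float, z_s: float = 1.0) -> complex:
    """Input impedance at port 1 from the closed form of Eq. (5)."""
    t = math.tan(theta)
    return 1j * z_s * (1 - 10 * t**2 + 7 * t**4) / (6 * t * (2 * t**2 - 1))

def z11_from_network(theta: float, z_s: float = 1.0) -> complex:
    """Input impedance at port 1 with ports 2, 3 and 4 open-circuited.

    Assumed topology: ports 1 and 2 hang on one junction, ports 3 and 4 on the
    other; the junctions are linked by a line of electrical length 2*theta.
    """
    y_s = 1.0 / z_s
    t, t2 = math.tan(theta), math.tan(2.0 * theta)
    y_far = 2j * y_s * t                       # open stubs of ports 3 and 4 in parallel
    # admittance of the far junction seen through the connecting line
    y_link = y_s * (y_far + 1j * y_s * t2) / (y_s + 1j * y_far * t2)
    y_near = y_link + 1j * y_s * t             # add the open stub of port 2
    z_near = 1.0 / y_near
    # transform through the line of length theta that leads to port 1
    return z_s * (z_near + 1j * z_s * t) / (z_s + 1j * z_near * t)

for theta in (0.2, 0.5, 0.65, 1.0):
    print(theta, z11_closed_form(theta), z11_from_network(theta))
```

Agreement between the two functions supports both the assumed topology and the exponents in Eq. (5); the factor $2\tan^2\theta - 1$ in the denominators vanishes at $\theta = \arctan(1/\sqrt{2}) \approx 0.6155$.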


We now consider the terminal voltages and currents $V_i$ and $I_i$ to be associated with a wave traveling through the core. Let

$$\mathbf{k} = \mathbf{x}_o k_x + \mathbf{y}_o k_y \tag{9}$$

be the wave vector of this wave. Also, let $\mathbf{a}_1$ and $\mathbf{a}_2$ be the basic translation vectors of the periodic structure, as shown in Fig. 2, and given by

$$\mathbf{a}_{1,2} = \frac{3}{2}\,w\,\mathbf{x}_o \pm \frac{\sqrt{3}}{2}\,w\,\mathbf{y}_o. \tag{10}$$

The Floquet assumption for periodic structures then implies the following relation between the terminal voltages and currents:

$$V_3 = V_1 e^{-j\mathbf{k}\cdot\mathbf{a}_1}, \quad I_3 = -I_1 e^{-j\mathbf{k}\cdot\mathbf{a}_1}, \quad V_4 = V_2 e^{-j\mathbf{k}\cdot\mathbf{a}_2}, \quad I_4 = -I_2 e^{-j\mathbf{k}\cdot\mathbf{a}_2}. \tag{11}$$

The minus sign before $I_1$ and $I_2$ is due to the fact that the currents are assumed positive into the unit cell.

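Substituting Eq. (11) into Eq. (4), one obtains four linear homogeneous equations in terms of the four unknowns $V_1$, $V_2$, $I_1$ and $I_2$. Defining

$$\psi_{1,2} = e^{-j\mathbf{k}\cdot\mathbf{a}_{1,2}}, \tag{12}$$

the four equations may be written as

$$\begin{bmatrix} Z_{11} - \psi_1 Z_{13} & Z_{12} - \psi_2 Z_{13} & -1 & 0 \\ Z_{12} - \psi_1 Z_{13} & Z_{11} - \psi_2 Z_{13} & 0 & -1 \\ Z_{13} - \psi_1 Z_{11} & Z_{13} - \psi_2 Z_{12} & -\psi_1 & 0 \\ Z_{13} - \psi_1 Z_{12} & Z_{13} - \psi_2 Z_{11} & 0 & -\psi_2 \end{bmatrix} \begin{bmatrix} I_1 \\ I_2 \\ V_1 \\ V_2 \end{bmatrix} = 0. \tag{13}$$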


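In order for Eq. (13) to have a non-trivial solution, the determinant of the 4x4 matrix must vanish. Setting the determinant equal to zero gives the dispersion equation

[Eq. (14): dispersion equation; illegible in the source. The surviving fragments show terms in $\cos^2\delta$, $\cos\delta\cos\gamma$ and the impedances of Eq. (4).]

where

$$\delta = \frac{\sqrt{3}}{2}\,w k_y, \qquad \gamma = \frac{3}{2}\,w k_x. \tag{15}$$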


This equation may then be solved numerically for its root $k_x$ as a function of $k_y$ and $\omega$, or $k_y$ as a function of $k_x$ and $\omega$. The root corresponds to a wave in the core whose polarization is found from the eigenvector associated with the root.

Because of the periodicity of the honeycomb, Eq. (13) will have multiple solutions. Let $\mathbf{u}_1$ and $\mathbf{u}_2$ be reciprocal lattice vectors

$$\mathbf{u}_{1,2} = \frac{2\pi}{3w}\,\mathbf{x}_o \pm \frac{2\pi}{\sqrt{3}\,w}\,\mathbf{y}_o, \tag{16}$$

so that

$$\mathbf{u}_i \cdot \mathbf{a}_j = 2\pi\delta_{ij}, \tag{17}$$

where $\delta_{ij}$ is the Kronecker delta. Thus, if a particular wavevector $\mathbf{k}$ satisfies Eq. (14), then for any integers $m$ and $n$ the wavevector $\mathbf{k} + m\mathbf{u}_1 + n\mathbf{u}_2$ will also satisfy Eq. (13). Also, due to the hexagonal symmetry of the honeycomb, a plot of $k_y$ versus $k_x$ for fixed $\omega$ will have six-fold symmetry about the origin.
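The orthogonality relation of Eq. (17) can be verified directly from the translation vectors of Eq. (10) and the reciprocal lattice vectors of Eq. (16). In the short check below, the strip width `w = 1.0` is an arbitrary illustrative value and the helper name `dot` is hypothetical.

```python
import math

w = 1.0  # strip width; any positive value gives the same dot products

# Translation vectors of Eq. (10), as (x, y) components
a1 = (1.5 * w, +math.sqrt(3) / 2 * w)
a2 = (1.5 * w, -math.sqrt(3) / 2 * w)

# Reciprocal lattice vectors of Eq. (16)
u1 = (2 * math.pi / (3 * w), +2 * math.pi / (math.sqrt(3) * w))
u2 = (2 * math.pi / (3 * w), -2 * math.pi / (math.sqrt(3) * w))

def dot(p, q):
    """Dot product of two 2-D vectors given as tuples."""
    return p[0] * q[0] + p[1] * q[1]

# Eq. (17): u_i . a_j = 2 pi delta_ij
for name, u in (("u1", u1), ("u2", u2)):
    print(name, dot(u, a1) / (2 * math.pi), dot(u, a2) / (2 * math.pi))
```

Each printed row is a row of the identity matrix, which is the statement that shifting $\mathbf{k}$ by any integer combination of $\mathbf{u}_1$ and $\mathbf{u}_2$ leaves the Floquet phase factors unchanged.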

Numerical computations have been carried out for the dispersion curves $k_y$ versus $k_x$ for various fixed values of $\omega$. Because the S-H wave velocity is independent of the wall thickness $t$, and because the impedances of all transmission lines in Fig. 5 are the same, the results are independent of $t$. In addition, if the normalized frequency variable $\theta$ in Eq. (8) is used, and the wave numbers $k_x$ and $k_y$ are multiplied by $w/2$, the computations will apply to all materials and honeycombs.


Fig. 6. Reciprocal lattice points and slowness curves for: a) frequency parameter $\theta \ll 0.6155$; b) $\theta$ just below 0.6155; and c) $\theta$ in the range $0.6155 < \theta < \pi/4 = 0.7854$.

Due to the symmetry and periodicity of the honeycomb, the first quadrant in the $k_x$-$k_y$ plane of the first Brillouin zone is more than adequate to present the numerical results for the dispersion curves of $k_y$ versus $k_x$. However, in order to explain the variation of the dispersion curves with frequency, we have sketched the curves for several Brillouin zones in Figs. 6(a), (b) and (c). In Fig. 6(a) we have indicated several reciprocal lattice points by dots and the vectors $\mathbf{u}_1$ and $\mathbf{u}_2$. The dashed hexagon represents the first Brillouin zone. For low frequencies ($\theta \ll 0.6155$), the dispersion curve is very nearly a circle of radius $|\mathbf{k}|\,w/2 = \sqrt{2}\,\theta$. This value of $|\mathbf{k}|$ corresponds to a wave velocity for the Bloch wave that is $1/\sqrt{2}$ times the shear wave velocity of the honeycomb material. Because of the periodicity, the near-circles centered about each reciprocal lattice point are all loci of solutions of the dispersion relation, Eq. (14), as discussed after Eq. (17).

As the frequency variable $\theta$ increases to just below 0.6155, the near-circular dispersion curves expand and distort into nearly hexagonal shapes, as indicated in Fig. 6(b). At $\theta = 0.6155$, the tips of the hexagons touch. For $0.6155 < \theta < \pi/4 = 0.7854$, the dispersion curves switch to being nearly triangular and centered about the apexes of the dashed hexagon. As $\theta$ approaches $\pi/4$, the dispersion curves shrink to the points at the apexes of the hexagon. For $\theta$ increasing above $\pi/4$, the change in the dispersion curves is reversed from that described above until, for $\theta$ approaching $\pi/2$, the curves become nearly circular as in Fig. 6(a). For $\theta$ increasing above $\pi/2$, the variation of the dispersion curves described above is repeated periodically with period $\pi/2$.

The actual dispersion curves in the first quadrant of the first Brillouin zone are plotted to scale in Fig. 7, and are seen to be in agreement with the discussion of Fig. 6. One of the features of these curves that may be important for NDE is the fact that in certain frequency ranges the waves are cut off along $+x$ and the four other directions making angles that are multiples of 60° with the $+x$ axis. These cut-off directions are ones parallel to a set of cell walls. However, the waves are never cut off along $y$ and the six-fold related axes.

Fig. 7. Slowness curves in the first quadrant of the first Brillouin zone for several values of the frequency parameter $\theta$.
In order to show the cut-off behavior for propagation along $x$, we have plotted $k_x$ as a function of $\omega$ for the case $k_y = 0$. This plot is shown in Fig. 8, where $k_x$ is real for $0 < \theta < 0.6155$ and $0.9553 < \theta < \pi/2$. In the range $0.6155 < \theta < 0.9553$, $k_x$ is complex with real part given by $\mathrm{Re}(k_x w/2) = \pi/3$. The curves are symmetric about $\theta = \pi/4$. For $\theta$ outside the range plotted, the curves repeat with period $\pi/2$.

Fig. 8. Dispersion curve for propagation along $x$ ($k_y = 0$). In the range $0.6155 < \theta < 0.95$, $k_x$ is complex with real part given by $\mathrm{Re}(k_x w/2) = \pi/3$. Outside this range $k_x$ is real.
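The stop-band edges and the low-frequency slope quoted above can be reproduced from a compact form of the dispersion relation. The sketch below does not use the determinant of Eq. (13). It assumes instead the equivalent relation obtained by applying the junction conditions (equal voltages, currents summing to zero) directly to a honeycomb of identical lines of electrical length $2\theta$ between neighboring junctions, namely $9\cos^2 2\theta = 1 + 4\cos\delta\cos\gamma + 4\cos^2\delta$, with $\delta$ and $\gamma$ as in Eq. (15). This form is an assumption made for illustration; the function name is hypothetical.

```python
import cmath
import math

def gamma_on_x_axis(theta: float) -> complex:
    """Solve 9 cos^2(2 theta) = 5 + 4 cos(gamma) for gamma = (3/2) w k_x.

    This is the assumed dispersion relation evaluated at k_y = 0 (delta = 0).
    A complex result indicates a stop band.
    """
    return cmath.acos((9.0 * math.cos(2.0 * theta) ** 2 - 5.0) / 4.0)

# Band edges: cos(gamma) reaches -1 when cos^2(2 theta) = 1/9.
lower_edge = 0.5 * math.acos(1.0 / 3.0)
upper_edge = math.pi / 2.0 - lower_edge
print(f"stop band for {lower_edge:.4f} < theta < {upper_edge:.4f}")

# Inside the stop band the real part of gamma is pi, i.e. Re(k_x w / 2) = pi / 3.
print(gamma_on_x_axis(0.8).real / 3.0, math.pi / 3.0)

# Low-frequency limit: (k_x w / 2) / theta tends to sqrt(2).
theta_small = 0.01
print(gamma_on_x_axis(theta_small).real / 3.0 / theta_small, math.sqrt(2.0))
```

The band edges, the value $\pi/3$ inside the stop band and the ratio $\sqrt{2}$ at low frequency all agree with the figures quoted in the text, which suggests that the assumed relation is equivalent to Eq. (14) for this polarization.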

C.  Summary 

In  order  to  develop  acoustic  NDE  procedures  to  find  defects  in  honeycomb 
panels  it  is  necessary  to  understand  the  propagation  of  waves  through  the  honeycomb, 
especially  in  the  range  where  the  cell  diameter  is  on  the  order  of  the  wavelength. 

We have developed an equivalent network approach from which the propagation
characteristics can be found. This approach was used to compute the propagation
characteristics for shear waves propagating transverse to the cell axis and polarized
with particle velocity parallel to the cell axis. At low frequencies the wave propagation
is nearly isotropic in the plane perpendicular to the cell axis, with a phase velocity
lower than the shear wave velocity of the honeycomb material. At higher
frequencies, the periodicity of the honeycomb becomes significant and the propagation
is anisotropic with six-fold symmetry. At high enough frequencies, no propagation
takes place in directions parallel to each of the sets of cell walls. However, in
the orthogonal directions propagation takes place at all frequencies.

OBSERVATION OF THE LEAKY-WAVE FIELD IN AN ACOUSTICALLY-EXCITED
LAYERED STRUCTURE

D.A. Davids and J. O'Brien

A. Introduction

The existence of a leaky wave has been used to explain the observed lateral
displacement of an acoustic beam reflected from a solid substrate,¹ the operation of
acoustic wedge transducers,² and as a design principle in a prism device for nondestructive
evaluation.³ Although the beam displacement and the concomitant null region in the
reflected beam have been observed and documented,⁴ the existence and predicted
functional dependence of the leaky wave have not been given an explicit experimental
verification.

In order to provide a clearer test of the predicted properties of the leaky wave,
an experiment was designed in which a wedge transducer is used to launch a Rayleigh
wave on the surface of a polished aluminum substrate. The Rayleigh wave is then made
to impinge on a layered structure consisting of a slab of dense flint glass separated
from the substrate by a fluid film of controlled thickness (refer to Figure 1). The
upper surface of the glass slab is of optical quality and is aluminized to provide a
mirror surface. The displacement of this upper surface at the acoustic excitation
frequency is then measured by means of a heterodyne laser probe, the operation of which
is described by Whitman and Korpel.⁵ The complete structure is located on an x-y
positioner, which enables a determination of the acoustic excitation as a function of
either lateral or transverse coordinates. The thickness of the fluid film is established
by plastic shims that are located well outside of the region of the acoustic beam. The
slab is terminated in a reservoir filled with a viscous damping fluid which approximately
"index matches" the acoustic wave in the glass; its function is to absorb energy from the
acoustic wave and thus reduce reflections and the consequent standing waves. Unwanted
surface-wave reflections on the aluminum substrate were reduced to a negligible level
by rounding the edges of the substrate and engraving the underside with a random
pattern of grooves that were filled with rubber cement.

Fig. 1. System used to observe leaky-wave characteristics
by means of a heterodyne-type laser probe.

A number of transducer wedge configurations were observed in order to obtain a
Rayleigh beam of nearly Gaussian profile. Care was taken to ensure that the widths of
the wedges, substrates, and slabs were sufficient that diffraction effects were negligible.

B.  Procedure 

Continuous Rayleigh waves of f = 1 MHz were launched by means of a wedge transducer
coupled to the aluminum substrate by a 1/2 mm water film. The Rayleigh beam
was observed and confirmed to be well collimated and reasonably close to Gaussian in
profile. During all experimental measurements the damping reservoir was filled with
Dow Corning type 200 silicone fluid of 1000 cS viscosity, which appeared to most
dramatically reduce the standing acoustic waves observed at the upper surface of the glass
slab. Water film thickness under the glass slab, as determined by plastic shim stock,
was varied over the range of approximately 1/8 mm to 1 mm. Acoustic displacement
at f = 1 MHz was measured as a function of the longitudinal coordinate for each of the
reported values of fluid film thickness. Since the observations required for a single
fluid film thickness involved several hours of measurement, the acoustic power levels
were kept low (< 100 mW electrical power into the transducer) so that thermal effects
in the PZT transducer and its fluid coupling film would not produce long-term effects
in the acoustic excitation. This excitation level produced peak acoustic displacements
at the upper surface of the glass slab in the range of 0.1 to 5 Å which, unfortunately,
restricted the range over which the decay could be observed. During each experimental
run previously measured points were constantly rechecked to ensure long-term
repeatability. Where standing waves were insignificant, data points were plotted on semi-log
paper and exponential decay constants determined. Where standing waves were present,
data points were averaged over one period and the "sliding-average" value plotted
semi-logarithmically. This sliding average produced considerable data smoothing of the
4-5 data points per acoustic wavelength, while introducing little error since, in the cases
where this technique was used, the decay length was large compared with an acoustic
wavelength. Figure 2 illustrates cases where standing waves are either absent or
present, and in the latter case the effect of the sliding average on the measured data.
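
The data reduction described above (a sliding average over one standing-wave period, followed by a straight-line fit on a semi-log plot) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic, the ripple depth, period and sample spacing are invented for the example, and the function names are hypothetical rather than taken from the report.

```python
import math

def sliding_average(values, window):
    """Mean of each run of `window` consecutive samples."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def fit_decay_constant(x, amplitude):
    """Least-squares slope of ln(amplitude) versus x, returned as a positive decay constant."""
    y = [math.log(a) for a in amplitude]
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    slope = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
             / sum((xi - x_mean) ** 2 for xi in x))
    return -slope

# Synthetic data (all values assumed, for illustration only).
ALPHA_TRUE = 0.88        # decay constant, 1/cm
PERIOD = 0.3             # standing-wave period, cm
SAMPLES_PER_PERIOD = 5   # the text quotes 4-5 points per acoustic wavelength
RIPPLE = 0.3             # relative depth of the standing-wave ripple
step = PERIOD / SAMPLES_PER_PERIOD
x = [i * step for i in range(50)]
raw = [math.exp(-ALPHA_TRUE * xi) * (1 + RIPPLE * math.cos(2 * math.pi * xi / PERIOD))
       for xi in x]

# Average over one full period, and assign each average to the centre of its window.
smoothed = sliding_average(raw, SAMPLES_PER_PERIOD)
x_centres = sliding_average(x, SAMPLES_PER_PERIOD)

alpha_raw = fit_decay_constant(x, raw)
alpha_smoothed = fit_decay_constant(x_centres, smoothed)
print(f"raw fit: {alpha_raw:.3f} 1/cm, smoothed fit: {alpha_smoothed:.3f} 1/cm")
```

Averaging over exactly one period leaves the exponential envelope exponential with the same decay constant, which is why the fitted slope is changed little when the decay length is long compared with the period.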

C.  Results 

The results of the measurements are illustrated in Fig. 3, where the predictions
of Hou and Bertoni appear as solid lines. Their calculations were for a steel rather
than an aluminum substrate, but this should make little difference in the result. We
observe that our decay constants are approximately two hundred times the decay
constants predicted by the theory. For comparison we have multiplied the Hou and Bertoni
predicted values by approximately 200 and plotted the result as a dotted line. The
functional dependence on film thickness is then seen to be quite good. The phase constant
of the leaky wave on the surface of the flint-glass slab was observed to be β = 3.55 cm⁻¹.

Fig. 2. Acoustic displacement versus longitudinal distance (inches).

A. Measured harmonic acoustic displacement in the presence of
standing waves (worst case) for .020" coupling film thickness.

B. Sliding average over one acoustic wavelength. The phase constant
β of the leaky wave used in averaging was measured to be β = 3.55 cm⁻¹.

C. Semilog plot of averaged displacement for the above case (circles:
averaged data for .020" coupling layer). The line corresponds to a
decay constant of α = 0.88 cm⁻¹. Also indicated (crosses) is unsmoothed
data for .030" coupling layer. The line represents a decay constant
of α = 1.31 cm⁻¹.

Fig. 3. Comparison of experimental measurements for a water layer
between aluminum and dense flint glass with the predictions of Hou and
Bertoni for a water layer between steel and dense flint glass (solid line).
The dotted curve represents a scaled version of the predicted result
(scale factor × 200). Both decay constant α and water film thickness H
are normalized to the acoustic wavenumber in water.

D.  Conclusions 


Under the conditions detailed above, we have observed exponentially decaying
leaky-wave fields with decay constants which differ by a large factor from the values
numerically predicted by Hou and Bertoni but which, when appropriately scaled, show
a remarkable agreement with the predicted functional dependence on film thickness.
At this time we have not yet been able to identify the source of the scaling discrepancy.

LEAKY SAW AND GUIDED LIGHT BRAGG INTERACTION IN LiNbO₃

F. Freyre and W-C. Wang

The results of a theoretical analysis of Bragg interaction of optically guided light
and leaky acoustic surface waves are presented. These results are compared to those
of a similar analysis for YZ Rayleigh waves. For TM optical modes, X propagating
leaky waves on metallized 64° rotated Y cut LiNbO₃ interact nearly twice as strongly
at the surface as the free surface Rayleigh waves. However, X propagating leaky waves
on free surface 41° rotated Y cut LiNbO₃ have near zero interaction for both TM and
TE modes.


A.  Introduction 

Leaky waves on rotated Y cut LiNbO₃ were described by Yamanouchi and
Shibayama. They found that X traveling waves on 41° rotated Y cut free surface
LiNbO₃ and 64° Y cut metallized LiNbO₃ have zero decay constants. Thus they are
true surface waves with no bulk wave components. They also found that the waves have
large electro-mechanical coupling coefficients. In our analysis of the Bragg acousto-optic
interactions of optical surface waves with these waves (and with Rayleigh waves)
we shall assume the existence of a thin optical waveguide on the surface of the LiNbO₃.
This guide will have been fabricated by metal ion indiffusion or Li₂O outdiffusion,
with a resulting tapered refractive index profile. In some cases we shall consider the
surface to be thinly metallized. We shall also assume that the diffusion process has
not significantly affected the elasto-optic and electro-optic coefficients of the
LiNbO₃.

Figure  1 shows  the  relative  orientation  of  the  crystal,  light  and  SAW. 


Fig. 1. Top surface of crystal showing relative directions of
guided lightwaves and leaky surface waves.

B. Acoustic Analysis

For TE and TM light incident at the Bragg angle, the Bragg diffraction efficiency ζ
is approximately given by

\[ \zeta = \sin^2\!\left( \frac{\Gamma_{nm}\, L}{\cos\theta_B} \right) \qquad (1) \]

where L is the acoustic beam width, Γ_nm is the coupling constant for mode m to mode
n diffracted light, and θ_B is the Bragg angle. The expression for the coupling constant is

\[ \Gamma_{nm} \propto \frac{n^3}{\lambda_0}\,
   \frac{\int_0^\infty U_m(y)\, U_n(y)\, \bigl[\, p_{IJ}\, S_J(y) + r_{Ij}\, E_j(y) \,\bigr]\, dy}
        {\int_0^\infty U_n^2(y)\, dy},
   \qquad J = 1 \text{ to } 6, \quad j = 1 \text{ to } 3 \qquad (2) \]

I = 1 for the X traveling leaky waves and TE modes, 3 for the Z traveling Rayleigh
waves and TE modes, and 2 for TM modes with the Rayleigh or leaky waves.

U_m and U_n are the optical electric field distributions for TE modes and the optical
magnetic field distributions for TM modes, n is the refractive index, and λ_0 is the
free space wavelength of the light. p_IJ and r_Ij represent elasto-optic and electro-optic
tensor elements, respectively, in the crystal cut coordinate system. S_J(y) is the Jth
strain component of the SAW. E_j is the jth electric field component of the SAW. For TM
modes we have assumed that E_2 is much greater than the component of E in the light
propagation direction, which is generally true.
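
As a numerical illustration of Eq. (1), the sketch below evaluates the diffraction efficiency for a few coupling constants, taking Eq. (1) to read sin²(Γ L / cos θ_B). The beam width, Bragg angle and coupling values are hypothetical numbers chosen only to show the behavior, and the function name is an assumption of this example.

```python
import math

def bragg_efficiency(coupling, beam_width, bragg_angle):
    """Diffraction efficiency sin^2(Gamma * L / cos(theta_B)), following Eq. (1)."""
    return math.sin(coupling * beam_width / math.cos(bragg_angle)) ** 2

# Hypothetical operating point (not taken from the report).
beam_width = 2.0e-3                 # acoustic beam width L, metres
bragg_angle = math.radians(0.5)     # Bragg angle theta_B

for coupling in (100.0, 400.0, 785.0):   # coupling constant Gamma, 1/m
    efficiency = bragg_efficiency(coupling, beam_width, bragg_angle)
    print(f"Gamma = {coupling:6.1f} 1/m -> efficiency = {efficiency:.3f}")
```

The efficiency grows as the square of Γ L for weak coupling and reaches unity when Γ L / cos θ_B = π/2, which is why a larger coupling constant depletes the incident optical wave over a shorter interaction length.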

We  define  the  Overlap  Factor  (O.F.)  as  follows; 


J = 1 to  6 
j = 1,2,3 


Subscript I was defined above. C is a factor which equalizes the total power
flow per unit beam width (acoustic and electric) for each SAW, assuming equal
wavenumber. We arbitrarily let C = 1 for the YZ free surface Rayleigh wave (see Table I)
and adjust C for the other surface waves to get the same power. K_A is the acoustic
wavenumber, and all waves are assumed to have the same wavenumber. The 1/2 factor
indicates that the O.F. is a time average. A is the peak amplitude of the surface wave.

TABLE I. Values of C Which Equalize Power
Carried by Each Surface Wave.

Surface Wave         Total Power (watts)         C

41° F.S. Leaky       10.95 × 10^18 A² L K_A      .0068
64° Met. Leaky        8.69 × 10^15 A² L K_A      .24
YZ F.S. Rayleigh      5.00 × 10^14 A² L K_A      1.00
YZ Met. Rayleigh      2.92 × 10^14 A² L K_A      1.31
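
The C column of Table I can be cross-checked against the power column under one assumption that is not stated in the report: if C scales the wave amplitude, power scales as C², so C should equal the square root of the reference power divided by the power of the wave in question. The sketch below applies that check. The powers of ten are only partly legible in the source, so the exponents used here are assumed values, chosen because they reproduce the tabulated C column.

```python
import math

# Coefficients multiplying A^2 * L * K_A in the "Total Power" column of Table I.
# ASSUMPTION: the exponents (18, 15, 14, 14) are a reading of a partly illegible column.
power_coefficient = {
    "41 deg F.S. leaky": 10.95e18,
    "64 deg Met. leaky": 8.69e15,
    "YZ F.S. Rayleigh": 5.00e14,
    "YZ Met. Rayleigh": 2.92e14,
}

# C values as tabulated.
tabulated_c = {
    "41 deg F.S. leaky": 0.0068,
    "64 deg Met. leaky": 0.24,
    "YZ F.S. Rayleigh": 1.00,
    "YZ Met. Rayleigh": 1.31,
}

# Reference wave: YZ free-surface Rayleigh wave, for which C = 1 by definition.
reference = power_coefficient["YZ F.S. Rayleigh"]
computed_c = {name: math.sqrt(reference / p) for name, p in power_coefficient.items()}

for name in power_coefficient:
    print(f"{name:20s} computed C = {computed_c[name]:.4f}, tabulated C = {tabulated_c[name]}")
```

With these exponents the computed and tabulated values agree to within rounding, which supports both the assumed exponents and the amplitude-scaling interpretation of C.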

C.  Overlap  Factor  Results 

Figures 2, 3 and 4 show the overlap factors for the 64° leaky waves, free surface
Rayleigh waves, and metallized surface Rayleigh waves, respectively. These curves
represent the sum of the electro-optic and elasto-optic contributions. The 64° leaky
wave - TM interaction is noteworthy in that its magnitude exceeds all others at the
crystal surface. The 41° leaky wave overlap factors are very small and therefore not
shown. A thin guide which propagates TM modes and keeps most of the optical energy
near the crystal surface should exhibit faster depletion of the incident optical wave with
the 64° leaky wave than with any of the other acoustic waves considered.


Fig. 2. 64° leaky wave, X propagating:
    I    TE modes
    II   TM modes
    III  TM modes - 90° out of phase with II.

D.  Coupling  Coefficient  Calculations 

Using the previous results we investigated coupling coefficient behavior as a function
of surface wave type and optical waveguide effective thickness. We assumed a
typical graded index profile for the optical waveguide. The profile was approximately
complementary error function in shape and therefore similar to profiles reportedly
attained by metal indiffusion. Then, using a piecewise linear approximation and
Airy function solutions,⁴ we calculated E(y) for TE propagating modes and H(y) for TM
propagating modes. Figure 5 shows TE 0 and TM 1 waveforms for metallized
guides. Using these and free surface waveforms and the overlap factor curves we
calculate |Γ₀₀| for TE free surface and metallized guides and TM free surface guides,
and |Γ₁₁| for TM metallized guides. We "compressed" or "stretched" E(y) and H(y)
to imitate the effects of different effective waveguide thicknesses and used a numerical
integration method for the calculations.

Fig. 3. YZ Rayleigh wave - free surface:
    I    TE modes
    II   TM modes
    III  TM modes - 90° out of phase with II.

Fig. 4. YZ Rayleigh waves - metallized surface:
    I    TE modes
    II   TM modes
    III  TM modes - 90° out of phase with II.

Fig. 5. Electric and magnetic field distributions (not normalized)
for typical planar metallized graded index waveguide.
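
The "compress or stretch, then integrate numerically" step can be sketched as follows. Everything specific in this example is an assumption made for illustration: the mode shape, the overlap-factor curve and the function names are hypothetical stand-ins, not the Airy-function solutions or the overlap factors computed in the report. Only the procedure is the point: rescale the depth coordinate of a fixed mode shape by the effective thickness T, then evaluate a normalized overlap integral by the trapezoidal rule.

```python
import math

def mode_profile(u):
    """Hypothetical mode shape versus scaled depth u = y / T (zero at the surface, peak at u = 1)."""
    return u * math.exp(-u)

def overlap_factor(y_um):
    """Hypothetical overlap-factor curve decaying with depth y (microns)."""
    return math.exp(-y_um / 2.0)

def coupling(thickness_um, y_max_um=60.0, n_steps=6000):
    """Trapezoidal estimate of int U^2 * OF dy / int U^2 dy for effective thickness T."""
    dy = y_max_um / n_steps
    numerator = 0.0
    denominator = 0.0
    for i in range(n_steps + 1):
        y = i * dy
        weight = 0.5 if i in (0, n_steps) else 1.0   # trapezoid end-point weights
        u_squared = mode_profile(y / thickness_um) ** 2
        numerator += weight * u_squared * overlap_factor(y) * dy
        denominator += weight * u_squared * dy
    return numerator / denominator

for thickness in (1.0, 4.0, 7.0, 10.0):   # same thickness grid as Table II
    print(f"T = {thickness:4.1f} um -> normalized coupling = {coupling(thickness):.4f}")
```

For these assumed shapes the integral has the closed form 8 / (2 + T/2)³, so the numerical result can be checked independently. The trend, coupling falling as the guide thickens and the optical field moves away from the surface, mirrors the behavior reported for surface-concentrated overlap factors.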

Table II shows that the largest coupling coefficients are obtained with the 64° leaky
wave and TM modes and very thin guides, and for the YZ free surface Rayleigh wave
and TE modes, also with a thin guide. For thick guides (thicker than 4 microns) we
see that the Rayleigh-metallized-TM, Rayleigh-free surface-TM and TE combinations are
best.

In conclusion, we have found through a theoretical analysis that the 64° rotated
Y cut leaky surface wave offers the strongest acousto-optic interaction known to date.

TABLE II. Coupling coefficients vs. effective waveguide thickness.

                         Coupling                             Effective Waveguide
Surface                  Coefficient   Optical   Mode         Thickness T (Microns)*
Acoustic Wave            Component     Mode      Number (m)   1      4      7      10

64° Leaky-Metallized     Real          TM        1            3.34   1.     0      .4
64° Leaky-Metallized     Imag          TM        1            1.25   .35    0      .2
Rayleigh-Metallized      Real          TM        1            .9     .3     .05    .1
Rayleigh-Metallized      Imag          TM        1            .4     .65    .75    .75
Rayleigh-free surface    Real          TM        0            1.5    .65    .3     .15
Rayleigh-free surface    Imag          TM        0            .1     .4     .5     .6
Rayleigh-free surface    Real          TE        0            2.1    1.6    1.1    .7
64° Leaky-Metallized     Real          TE        0            .5     .4     .3     .2
Rayleigh-Metallized      Real          TE        0            .4     .55    .35    .2

* Normalized values. P is the power in watts, L the acoustic beam width in meters,
and K_A the acoustic wavenumber (3.57 × 10^[illegible] m⁻¹); the normalization also
involves a factor 1.58 × 10^[illegible].

OPTICAL IMAGING AND MEMORY USING A SILICON DIODE STORAGE CORRELATOR

W-C. Wang, L.S. Rosenheck and K.C. Whang

The optical characteristics (in visible and infrared frequencies) and the memory
characteristics of a silicon diode storage correlator have been under study for the past
few years. It has been shown experimentally that the storage time is strongly dependent
on light intensity and that the storage time in the dark at room temperature is about
40 seconds. We found the storage and writing time constants to be in general agreement
with the measured p-n junction characteristics. In addition we have found that
the amplitude of the correlation signal, after repetitive writing, increases to a peak
and then decreases with time.

A.  Description  of  the  Writing  and  Readout  Process 

Various types of acoustoelectric storage correlators have been reported.¹⁻⁴ The
diode array type appears to be the most promising. Although significant inroads have
been made in understanding the basic principles of the storage process, a detailed
interpretation of the observed experimental results is still not complete, owing to the
structural complexity of the diode mosaic.

Figure 1 shows the experimental configuration. On the LiNbO₃ substrate, a pair of
50 MHz transducers and one 100 MHz transducer are deposited at their respective
locations A, B and R. RCA 4532 vidicon diodes are placed on the LiNbO₃ substrate,
separated from it by an airgap of ≈ 1000 Å. As depicted in Fig. 1, the diode mosaic is
placed such that the diodes are facing the LiNbO₃ substrate. The reverse side of the
mosaic is coated with a metal, and the output of the storage correlator is taken across
the Si and LiNbO₃. The writing process is achieved by applying two r.f. pulsed signals
of frequency ω at transducers A and B. Through nonlinear interaction of the two
oppositely propagating waves, a signal which is time independent but has a spatial
variation cos(2kz) is produced.

Fig. 1. Schematic of the experimental configuration.
Center frequencies of A and B 50 MHz, and R 100 MHz.

We can represent the charge storage process as follows. The two oppositely
propagating acoustic waves can be represented as E_1 = S sin(ωt − kz) and
E_2 = W sin(ωt + kz), respectively. The superposition of these two waves generates a
stationary wave pattern

\[ E_1 + E_2 = E = S\sin(\omega t - kz) + W\sin(\omega t + kz) \]

which can in turn be represented as follows:

\[ \begin{aligned}
E &= (S + W)\cos kz\,\sin\omega t - (S - W)\sin kz\,\cos\omega t \\
  &= \bigl[(S^2 + 2SW + W^2)\cos^2 kz + (S^2 - 2SW + W^2)\sin^2 kz\bigr]^{1/2}\,\sin(\omega t + \phi(z)) \\
  &= \bigl[(S^2 + W^2)(\cos^2 kz + \sin^2 kz) + 2SW(\cos^2 kz - \sin^2 kz)\bigr]^{1/2}\,\sin(\omega t + \phi(z))
\end{aligned} \]

where

\[ \phi(z) = -\tan^{-1}\frac{(S - W)\sin kz}{(S + W)\cos kz} \]

so that

\[ E = (S^2 + W^2 + 2SW\cos 2kz)^{1/2}\,\sin(\omega t + \phi(z)) \]

If we represent the charging circuit as shown in Fig. 2, we note that the air gap
capacitor couples the superposed signal to the semiconductor. The positive r.f. cycles
will quickly charge the two capacitors, so that one can assume that peak detection is
occurring. We can thus get a stored charge profile

Q(z) = q(z) cos 2kz

with a π/k periodicity. The stored charge will in turn reverse bias the diode system.

Fig. 2. Circuit representation of the charging process.

The stored charge profile can now be read out by applying a signal R(t) cos(2ωt −
2kz). If R(t) ≈ δ(t), then the output is approximately the correlation of S and W. It
should be noted, however, that a time scaling factor of one-half is introduced in this type
of reading and writing process. This time scaling factor can be removed by using a
layered structure BGO-Si-diode-LiNbO₃, as pointed out in a previous paper.

Output signals are observed as shown in Fig. 3, where the triangular shaped signal,
indicated by I, corresponds to the convolution output between two rectangular writing
pulses, both of 2 μsec duration and 20 volts (p-p, across a 50 Ω load) in amplitude.
The other pulse of large amplitude corresponds to the applied readout pulse induced
through the air. The correlation output, designated at position II, appears after the
readout pulse.


Fig. 3. Signal outputs (time scale 2 μsec/Div.).

    I  - Convolution
    II - Correlation

If  at  time  t^  both  of  the  writing  pulses  are  suddenly  switched  off,  but  the  readout 
pulse  still  remains,  the  correlation  output  will  first  rise  to  a peak  and  then  decrease 
with  time  to  zero  as  indicated  in  Fig.  4,  at  different  light  levels. 

The storage time in total darkness is ≈ 25 seconds. One can further note that
the storage time is strongly dependent on light intensity. Similar results are obtained
using an infrared source, as indicated in Figure 5.

A reason for the dependence of the storage time on light intensity can be
hypothesized in the following manner. The stored charge is discharged through the
reverse-bias leakage current of the diode array, which in turn is dependent on the minority
carrier lifetime. As the light intensity increases, the minority carrier lifetime
decreases, causing the reverse-bias current to increase, and in turn the stored charge
leaks away faster. Figure 6 is the measured diode characteristic, which was obtained from
a diode mosaic with ohmic contacts deposited over a surface area of (2.5 mm)².
A 30 pF capacitance is measured in darkness. The RC time constant is ≈ 20 sec, which
shows good agreement with the storage time. Judging from the sensitivity of the leakage
current to very low light levels, it is no surprise that the storage time varies at
very low light levels.

Fig. 4. Correlation output as a function of time with
light intensities (visible) as a parameter (time scale 5 sec/Div.).

Fig. 5. Correlation output as a function of time with
light intensities (infrared) as a parameter.
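
The quoted figures imply a very large effective leakage resistance. The sketch below works out that resistance from the 30 pF capacitance and the ≈ 20 sec time constant, and then evaluates how much charge would remain after the ≈ 25 sec dark storage time. The single-exponential discharge is an assumption of this example (a simple lumped RC model), not a result from the report.

```python
import math

capacitance = 30e-12     # farads, measured in darkness
time_constant = 20.0     # seconds, quoted RC value

# Effective leakage resistance implied by R = tau / C.
leakage_resistance = time_constant / capacitance
print(f"implied leakage resistance: {leakage_resistance:.2e} ohm")

def remaining_fraction(t, tau):
    """Fraction of stored charge left after time t for a single-exponential discharge (assumed model)."""
    return math.exp(-t / tau)

print(f"fraction remaining after 25 s: {remaining_fraction(25.0, time_constant):.3f}")
```

A resistance of this order is easily reduced by photo-generated leakage current, which is consistent with the strong dependence of storage time on light level described above.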

The observed amplitude profile of the correlation output (as a function of time)
can perhaps be understood in the following manner. Considering the geometry and
physical dimensions of the diode arrays, the diodes actually occupy only a small portion
of the surface area (≈ 1/8); the rest of the surface area is occupied by an MOS structure.
In a previous paper we pointed out that the convolution output in such a structure can
actually be predicted by the model of a reverse-biased MOS structure. Here the
diode serves as a carrier injector, and the injected carriers, which are the stored
charges, reverse bias both the diode and the MOS. When the stored charge decreases
with time, so does the reverse bias. Since the nonlinear processes of convolution and
correlation are quite similar (except that the stored charge to be correlated with is
also decreasing with time), the correlation amplitude profile may exhibit a bump, as in
the case of the convolution in a reverse-biased diode mosaic. The disappearance of the
bump when the writing amplitude is decreased would be due to the simple reason that
the reverse bias is too low to cause a bump. A similar reason can be given when the
light intensity is increased: the charge storage is decreased and the bump disappears,
as shown in Figures 4 and 5.

Fig. 6. Reverse characteristics of the diode.

The amplitude of the correlation output, as observed in Figs. 3 and 4, is a function
of light intensity. In Fig. 7 we plot the amplitudes of both convolution and correlation
versus light intensity. It is noted that the correlation output varies linearly through a
large range of the variation of light intensity from low levels (4 × 10⁻⁶ mW/cm²) to

The theoretical model for the physical processes involved with the p-n junction
diode correlator is under investigation. This model will be based on the following
assumptions:

(1) Carrier injection occurs only in the direction of the surface normal.

(2) Diffusion of the injected carriers is only one-dimensional, in the direction
of the surface wave, x, and confined to the region between two adjacent
diodes.

(3) The injection in the y-direction is much faster than the x-directed
diffusion-drift process. This allows us to consider the two processes as occurring
sequentially.

(4) The recombination time of the minority carriers is related to the leakage
current and, therefore, to the storage time.

We first establish the initial conditions at time t = 0 for the device in Figure 1.
At t = 0⁺, a displacement current pulse D of width T is applied to transducers A and B.
Using the continuity equation with Poisson's equation, the charge distribution is found
at t = T. The charge redistributes through diffusion and drift and recombines through
leakage across the p-n junction. The spatial Fourier components of the charge
distribution are calculated as a function of time. After this is completed we hope to
develop a comprehensive circuit model which will simulate the charge storage process.
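
As a rough illustration of how assumptions (2) and (4) lead to time-dependent spatial Fourier components, consider a deliberately simplified case. This is a sketch, not the model under development: drift is neglected, and the diffusion constant D, the recombination (leakage) time τ and the spatial wavenumbers k_n are symbols introduced here for the example.

```latex
% Simplified continuity equation: one-dimensional diffusion along x with a single
% recombination time (drift neglected; D, \tau and k_n are assumed symbols).
\[ \frac{\partial \rho}{\partial t} = D\,\frac{\partial^2 \rho}{\partial x^2} - \frac{\rho}{\tau} \]

% Expanding the stored charge in spatial Fourier components,
\[ \rho(x,t) = \sum_n \rho_n(t)\,\cos(k_n x), \]

% each component obeys d\rho_n/dt = -(D k_n^2 + 1/\tau)\,\rho_n, so that
\[ \rho_n(t) = \rho_n(0)\,\exp\!\left[-\left(D k_n^2 + \frac{1}{\tau}\right)t\right]. \]
```

In this simplified picture the component at the stored spatial period decays faster than the total charge, because diffusion smooths the pattern while recombination removes the charge.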

CHARGE STORAGE IN A DIODE MOSAIC INDUCED BY SURFACE ACOUSTIC WAVES

W-C. Wang, L.S. Rosenheck, K.C. Whang and H. Schachter

Research on storage correlators has been an ongoing project of various groups
for the past few years.¹⁻⁴ Although these investigators have proposed various models
to represent the charge storage process, to date a completely cohesive theory for the
storage mechanism has not emerged. Reasons for this are twofold: first, different
writing processes have been employed and, second, the geometry of the diodes and the
storage devices are different. These devices range from Schottky diodes to p-n
junction diodes with neighboring MOS capacitors.

To facilitate the understanding of the charge storage process we conducted an
experiment to monitor the total charge in the system due to a single sonic pulse,
multiple sonic pulses, and contra-directional sonic pulses. From the experimental results
we also learned that a transverse acoustoelectric effect which accompanies the surface
wave will contribute during the charging process. The acoustoelectric effect plays a
role in forward and reverse biasing the p-n junction diodes.

Figure 1 shows the experimental configuration.

Fig. 1. Experimental set-up.

We detect our slowly time-varying output by first feeding it into a unity-gain op amp
with a high input impedance, > 10¹² Ω, and a slew rate of 10 V/μsec; the op amp
output is then monitored by a storage scope. The op amp acts to make the output of the
storage correlator load independent, and the storage scope is necessary to monitor the
charge build-up due to just a single SAW pulse. On the LiNbO₃ substrate a pair of 50
MHz transducers are deposited at locations 1 and 2. An RCA 4532 vidicon diode mosaic
is placed above the LiNbO₃ substrate, separated from it by an airgap. The diode mosaic is
manufactured starting from an n-type Si wafer of 50 Ω-cm resistivity. A thin SiO₂
layer with ≈ 5 μm diameter holes is first formed on the Si surface. Boron impurities
are then diffused into the hole sites to form the p-n junctions. The periodicity of the
diode array is ≈ 12 μm. The diodes are then overlaid with a p⁺-Si pad. An
antireflection coating (SiO) of several K thick is also deposited at the back of the Si
wafer. The total thickness of the Si wafer is ≈ 12 μm.

We are interested in the charge storage due to a single acoustic pulse
applied to just one transducer. Figures 2(a) and (b) show typical output
signals generated by successive r.f. pulses applied to transducer 1. The difference
between the figures is that in Fig. 2(a) the pulse width is 13 µsec, whereas in Fig. 2(b)
the pulse width is 6 µsec. In each of the oscillograms we labelled the first 5 traces
1-5. Trace 1 was obtained when we applied a pulse to transducer 1 with the system
uncharged. The successive pulses are applied at 0.2 second intervals and the
respective outputs are indicated by traces 2 through 5.


Fig. 2. Output signal generated by repetitive single pulse writing. The time interval
between successive pulses is 0.2 sec.

An  individual  trace  can  be  understood  in  the  following  manner: 

(1) When the pulse is first applied to transducer 1, but before it has propagated
under the semiconductor, no voltage is observed.

(2) As the pulse begins to propagate under the semiconductor, the output
voltage increases rapidly until the acoustic pulse is completely under
the semiconductor, at which point the output voltage continues to increase
but at a much slower rate.

(3) As the pulse emerges from under the semiconductor, the output voltage
decreases until the pulse has completely exited from under the
semiconductor. The output voltage is then just the charge storage. As we
apply successive pulses, each trace starts at about the previous charge
storage level; however, the incremental increase in output voltage is less
on each successive pulse.

In Fig. 2(a) it can be seen that after the fifth pulse was applied, successive
pulses contributed little to the charge storage level. The observed variation in the
output voltage is then due entirely to the transverse d.c. acoustoelectric effect. To show
the d.c. acoustoelectric effect more clearly, in Fig. 3 we compare the output signal
due to just a single writing with that which occurs in the same system after many
repetitive writings. For the latter case the d.c. acoustoelectric effect is symmetric and
flat.


Fig. 3. Output signals after repetitive writings in the upper case and after a single
writing in the lower case.

In order to show the charging process vs. time clearly, we subtracted the d.c.
acoustoelectric effect from the output voltage profile. The plot in Fig. 4, charge storage
as a function of time, is obtained from trace 1 of Figure 2(a). The curve shows
that the build-up of charge storage as a function of time is not a simple exponential.

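In fact the curve can be represented by an equation of the following form:

$$ K_3 - \left( K_2 e^{-t/\tau_2} - K_1 e^{-t/\tau_1} \right) $$

where $\tau_2 = 4.2 \times 10^{(?)}$ sec and $\tau_1 = 3.5 \times 10^{(?)}$ sec (the exponents
are illegible in the source), $K_3 = K_2 = 5.8$ and $K_1 = 3.9$.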

An  explanation  of  the  charging  process  is  now  presented. 

Because the diodes are isolated from each other, only those diodes directly above
the acoustic wave are charged. Each such diode is charged to or near its full
charge value within a few r.f. cycles, but the remaining diodes act as a load. As
more diodes become charged they can be considered as voltage sources loaded by a
capacitive and resistive load. For example, consider the case of 1,000 diodes in the
direction of SAW propagation. If initially the first diode is charged to a voltage V and
the remaining diodes are uncharged, then the observed output voltage is V/1000. If n
diodes are charged the output is nV/1000. Thus, the output voltage represents a
measure of the total charge stored in the system. If one considers the storage mechanism
for an individual diode, it can be represented by a simple voltage source applied to a
diode and a capacitor.

Fig. 4. Charge storage (single writing) as a function of time. Pulse width 13 µsec,
pulse height 15 volts p-p.
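The diode-counting argument can be illustrated with a minimal sketch. It assumes, as in the example in the text, N identical diodes of which n are charged to the same voltage V, so that the observed output is nV/N; the function and parameter names are hypothetical and are not taken from the report.

```python
def output_voltage(n_charged: int, v_diode: float, n_total: int = 1000) -> float:
    """Observed output voltage when n_charged of n_total identical diodes
    are each charged to v_diode and the rest act as an uncharged load.

    This encodes only the n*V/N scaling quoted in the text; it is not a
    circuit simulation of the diode mosaic.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_charged <= n_total:
        raise ValueError("n_charged must lie between 0 and n_total")
    return n_charged * v_diode / n_total


if __name__ == "__main__":
    # The output grows in proportion to the number of charged diodes,
    # i.e. to the total stored charge.
    for n in (1, 10, 100, 1000):
        print(n, output_voltage(n, v_diode=1.0))
```

Because the output is linear in n, two disjoint groups of charged diodes contribute additively, which is the behaviour the two-transducer experiment described next is meant to test.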

To further illustrate the accumulation of charge in the system, consider the effect
of applying pulses simultaneously to transducers 1 and 2. Using the same structure
as in Fig. 2 we obtained the oscillogram of Fig. 5, where each trace starts at 0 volts.
In the case of a single r.f. pulse applied to transducer 1 or a single r.f. pulse applied
to transducer 2, the output voltages from the respective traces ((a) and (b)) are nearly
the same. For the case of pulses applied simultaneously to transducers 1 and 2
we have the result indicated by trace (c). Notice that trace (c) is approximately
the sum of trace (a) and trace (b), which confirms our explanation.

Fig. 5. Output signal induced by 2 r.f. pulses of the same width, 6 µsec, and height,
22 volts p-p.

(a) Single r.f. pulse applied to transducer 1.

(b) Single r.f. pulse applied to transducer 2.
We also considered the effect of different pulse heights but the same pulse width
applied to transducer 1; the result is shown in the oscillogram of Figure 6. We note
that both the charge storage level and the amplitude of the transverse acoustoelectric
effect increase with a corresponding increase in the applied voltage.
The stored charge vs. the r.f. pulse height is plotted in Figure 7. It can be seen that
above a threshold voltage ($V_T$ = 20 volts), the charge storage increases linearly
with applied voltage. The existence of a threshold voltage is due to the nonlinearity
of the diode response.


Fig. 6. Output signals generated by r.f. input of 6 µsec pulse width but of different
heights. The time interval between the pulses is over 30 seconds.

Charge storage will also increase with increasing r.f. pulse width
until saturation sets in. Saturation occurs in time because, as the airgap, depletion and
MOS capacitors are charged toward a particular voltage, the current due to the applied
pulse decreases until eventually the current is equal to zero. At this point there is no
additional charge accumulation, as is illustrated in Figure 8, which shows the variation
of incremental charge deposition for the case of 5 repetitive writings with the r.f.
pulse width as a parameter. We can see that the incremental charge storage
decreases dramatically after the first write-in.

In Fig. 9 we consider the effect of the r.f. input voltage on the transverse
acoustoelectric effect. The magnitude of the transverse acoustoelectric voltage depends
on the conductivity of the semiconductor (charge storage), the pulse width and the
pulse height.


Fig. 9. Transverse acoustoelectric voltage vs. r.f. input voltage.

In Fig. 10 the back of the diode mosaic is facing the LiNbO3 surface. Figure 11
illustrates the output voltage profile corresponding to three input r.f. pulses of equal
height, but different width. Comparing Fig. 11 with Fig. 2 we note that the polarity of
the acoustoelectric voltage remains the same whereas the polarity of the stored charge
is reversed. This is due to the fact that the acoustoelectric effect occurs mainly
inside that portion of the n-type semiconductor located between 2 diodes. This shows
that for the case of Fig. 10 the acoustoelectric effect forward biases the diodes, whereas
in the case of Fig. 1 the acoustoelectric effect reverse biases the diodes. In
conclusion, for optimum operation the arrangement illustrated in Fig. 10 is preferable
during the writing process, whereas that of Fig. 1 is preferable during the readout
process.

Fig. 10. Modified configuration of original experiment.

Fig. 11. Output signals generated by r.f. pulses with different pulse widths and a
pulse height of 30 volts p-p (back of diode array facing LiNbO3). The
time interval between pulses is over 30 seconds.

PERTURBATION ANALYSIS FOR OPTICAL GRATING COUPLERS
S.T.  Peng  and  T.  Tamir 

Dielectric gratings have found increasing application as couplers of light beams
into thin-film optical waveguides. Because of their growing importance, theoretical
studies of such grating couplers have been quite numerous and they have included both
exact and approximate formulations.¹ However, rigorous treatments²,³ of the pertinent
boundary-value problem are quite complex and require time-consuming and elaborate
high-precision computer programs to yield accurate quantitative results. To
avoid this, perturbation procedures have been proposed for obtaining approximate
results, but most of these are restricted either to gratings having shallow grooves or to
specialized applications. The results so far available have therefore not been adequate
to develop systematic criteria for the design of grating couplers.

In the past, we have developed an improved perturbation procedure that was shown
to be suitable for an accurate analysis of beam couplers.⁴,⁵ In contrast to other methods,
this approach is not restricted to shallow grating grooves and it can be applied to
arbitrary grating profiles. Furthermore, the analytic formulation of the electromagnetic
fields can be viewed in terms of equivalent transmission-line networks that lend
considerable insight into the wave-coupling mechanism. These networks enable one to
assess the role of each grating parameter, as discussed separately in this report (see
Reference 6). In addition, we have observed that the previous analytic results can be
further simplified to obtain explicit formulas that are particularly useful for design
considerations.

To carry out a perturbation analysis of a dielectric grating waveguide having an
arbitrary profile as shown in Fig. 1(a), the relative dielectric constant is written
everywhere as

$$ \epsilon(x,z) = \epsilon_u + \epsilon_p(x,z) \qquad (1) $$

where the subscript u = s, f, r and a in $\epsilon_u$ denotes the substrate, film, residual
layer and superstrate (usually, air) regions, respectively. The term $\epsilon_p = \epsilon_p(x,z)$
is zero everywhere except in the grating region ($0 < z < t_g$), where u = g and
$\epsilon_g$ then refers to the (volume) average value of $\epsilon(x,z)$ inside that region.


If $\epsilon_p$ vanishes also inside the grating region, we are dealing with a multi-layered
uniform medium. This multi-layered medium, which is regarded henceforth as a basic
structure, supports well-known surface-wave modes. The quantity $\epsilon_p = \epsilon_p(x,z)$, which
describes the actual periodic variation inside the grating layer, is then viewed as a
perturbation imposed on that basic (unperturbed) structure. Due to the periodicity of
the grating, this perturbation can be written as a Fourier series

$$ \epsilon_p = \epsilon_p(x,z) = \sum_{n=-\infty}^{\infty} \epsilon_n(z) \exp(i 2 n \pi x / d) \qquad (2) $$

Fig. 1. Geometry of the grating structure:

(a) General asymmetric profile;

(b) Rectangular profile.


Because $\epsilon_g$ was defined as the average permittivity inside the entire grating layer
$0 < z < t_g$, the n = 0 term in Eq. (2) satisfies

$$ \int_0^{t_g} \epsilon_0(z)\, dz = 0 \qquad (3) $$


The electric and magnetic fields may also be written, respectively, as

$$ \mathbf{E} = \mathbf{E}_u + \mathbf{E}_p , \qquad (4) $$

$$ \mathbf{H} = \mathbf{H}_u + \mathbf{H}_p . \qquad (5) $$

Here $\mathbf{E}$ and $\mathbf{H}$ refer to the fields in the actual grating structure shown in Fig. 1(a),
whereas $\mathbf{E}_u$ and $\mathbf{H}_u$ denote the fields in the basic structure. Hence $\mathbf{E}_p$ and $\mathbf{H}_p$ refer to
perturbations superposed on $\mathbf{E}_u$ and $\mathbf{H}_u$ when the periodic variation is introduced to
modify the basic structure into the actual grating configuration. Introducing Eqs. (4)
and (5) into Maxwell's equations and noting that $\mathbf{E}_u$ and $\mathbf{H}_u$ also satisfy these equations
independently, we obtain

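$$ \nabla \times \mathbf{E}_p = i \omega \mu_0 \mathbf{H}_p , \qquad (6) $$

$$ \nabla \times \mathbf{H}_p = -i \omega \epsilon_0 \left( \epsilon_u \mathbf{E}_p + \epsilon_p \mathbf{E}_p + \mathbf{p} \right) , \qquad (7) $$

where $\mu_0$ and $\epsilon_0$ are the permeability and permittivity of vacuum, while

$$ \mathbf{p} = \epsilon_p \mathbf{E}_u \qquad (8) $$

is a known dipole distribution generated by the interaction between the incident field and
the periodic perturbation $\epsilon_p$. If we assume that, inside the brackets of Eq. (7), $\epsilon_p \mathbf{E}_p$
is of second order compared to the other terms, that equation reduces to

$$ \nabla \times \mathbf{H}_p = -i \omega \epsilon_0 \left( \epsilon_u \mathbf{E}_p + \mathbf{p} \right) . \qquad (9) $$

The solution for $\mathbf{E}$ and $\mathbf{H}$ is thus resolved into first finding $\mathbf{E}_u$ and $\mathbf{H}_u$ along the basic
structure by using homogeneous (source-free) Maxwell's equations. We then find $\mathbf{E}_p$
and $\mathbf{H}_p$ along the same basic structure by employing the inhomogeneous (dipole source)
Maxwell's Equations (6) and (9). This leads to a result for $\mathbf{E}$ and $\mathbf{H}$ whose accuracy is
affected mainly by the omission of $\epsilon_p \mathbf{E}_p$ in Equation (7).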
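The step from Maxwell's equations to the perturbation equations can be written out explicitly. The sketch below assumes non-magnetic media and the exp(-iωt) time dependence, so that the source-free equations take the form ∇×E = iωμ₀H and ∇×H = -iωε₀εE; these forms are an assumption chosen to agree with Eq. (6), not a quotation from the report.

```latex
% Total fields in the actual grating structure, with
% \epsilon = \epsilon_u + \epsilon_p, E = E_u + E_p, H = H_u + H_p:
\nabla \times (\mathbf{E}_u + \mathbf{E}_p) = i\omega\mu_0 (\mathbf{H}_u + \mathbf{H}_p), \qquad
\nabla \times (\mathbf{H}_u + \mathbf{H}_p) = -i\omega\epsilon_0 (\epsilon_u + \epsilon_p)(\mathbf{E}_u + \mathbf{E}_p).

% Fields of the basic (unperturbed) structure satisfy the same equations with \epsilon_p = 0:
\nabla \times \mathbf{E}_u = i\omega\mu_0 \mathbf{H}_u, \qquad
\nabla \times \mathbf{H}_u = -i\omega\epsilon_0 \epsilon_u \mathbf{E}_u.

% Subtracting the second pair from the first:
\nabla \times \mathbf{E}_p = i\omega\mu_0 \mathbf{H}_p, \qquad
\nabla \times \mathbf{H}_p = -i\omega\epsilon_0 \left( \epsilon_u \mathbf{E}_p + \epsilon_p \mathbf{E}_p + \epsilon_p \mathbf{E}_u \right).

% The last term is the known dipole source p = \epsilon_p E_u; dropping the
% second-order product \epsilon_p E_p leaves
\nabla \times \mathbf{H}_p \simeq -i\omega\epsilon_0 \left( \epsilon_u \mathbf{E}_p + \mathbf{p} \right).
```

Written this way, the only approximation is the neglect of the product of two small quantities, which is why the accuracy of the method is governed by that single omitted term.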

Because of the two-dimensional geometry, the surface-wave modes supported by
the basic structure are invariant with respect to y and their fields are of either the TE
or TM type. The field solutions of a surface-wave mode in the basic structure can be
easily obtained by standard techniques. In particular, the total surface-wave power can
be given by:

$$ P_{sw} = \int_{-\infty}^{\infty} \mathbf{E}_u \times \mathbf{H}_u^{*} \cdot \mathbf{x}_0 \, dz =
\begin{cases}
\tfrac{1}{2} (\beta_0 / \omega \mu_0)\, t_{eff} , & \text{for TE modes,} \\
\tfrac{1}{2} (\beta_0\, \omega \epsilon_0 \epsilon_f / k_{zf}^2)\, t_{eff} , & \text{for TM modes,}
\end{cases} \qquad (10) $$

where the asterisk denotes the complex conjugate and $\mathbf{x}_0$ is a unit vector along x. Here
$t_{eff} > t_f$ is an effective thickness which can be assumed to contain all of the
surface-wave power. For the general case of a multi-layer structure, the formulae for $t_{eff}$ are
rather complicated and are therefore omitted here. However, for the two simple limit
cases $t_g = 0$ and $t_g \to \infty$ already discussed above, we have

[Eq. (11): expressions for $t_{eff}$ for TE modes and for TM modes; not legible in this copy.]

where m = a or g. It is interesting to note that $t_{eff}$ is identical to the effective thickness
that takes into account the Goos-Hänchen shift in thin-film waveguides.


The presence of the grating can modify the surface-wave fields so that they leak
energy into the air and substrate regions. For a first-order perturbation, the perturbed
(leaky wave) fields are represented in terms of the space harmonics, each of which
independently satisfies the source-excited transmission-line relations:

$$ \frac{dV_n}{dz} = i k_{zn} Z_n I_n - v_n , \qquad (12) $$

$$ \frac{dI_n}{dz} = i k_{zn} Y_n V_n - j_n , \qquad (13) $$

where the distributed voltages $v_n = v_n(z)$ and currents $j_n = j_n(z)$ are related to the basic
surface-wave fields in the grating region and are therefore known.

The leakage parameter α of the perturbed (leaky) wave can be found by recalling
that the power P = P(x) along the perturbed structure varies as exp(-2αx). We
therefore have

$$ \frac{dP}{dx} = -2 \alpha P \qquad (14) $$

where the change dP in the power is due to those harmonic fields that radiate away from
the structure. We then obtain

$$ -\frac{dP}{dx} = P_{rad} = \sum_n p_n , \qquad (15a) $$

$$ p_n = p_n^{(a)} + p_n^{(s)} = \mathrm{Re}(Y_n^{(a)})\, |V_n(t_g)|^2 + \mathrm{Re}(Y_n^{(s)})\, |V_n(0)|^2 , \qquad (15b) $$

where Re(q) denotes the real part of q. Here $P_{rad}$ is the total power per unit length
(along x) that has leaked into the substrate and air regions. As this power is given by
those scattered waves that propagate (rather than decay) in those regions, n in Eq. (15)
runs over only those values for which $Y_n^{(a)}$ and/or $Y_n^{(s)}$ are real. Because $\epsilon_s > \epsilon_a$, it is
easy to verify that sometimes only $p_n^{(s)}$ may contribute to $p_n$, while $p_n^{(a)}$ is zero, i.e.,
the n-th diffracted order radiates in the substrate but not in the air region. Combining
Eq. (14) with Eq. (15), we get

$$ \alpha = \frac{P_{rad}}{2P} = \sum_n \alpha_n , \qquad \alpha_n = \frac{p_n}{2P} , \qquad (16) $$

so that $\alpha_n$ denotes the (partial) leakage accounted for by the n-th line in which
propagation occurs in the air and/or the substrate region.
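The bookkeeping behind the leakage parameter can be sketched numerically. The snippet assumes only what Eqs. (14) and (15) state: the guided power decays as exp(-2αx) and the power lost per unit length is the sum of the partial radiated powers, so that α = (Σ pₙ)/(2P). The function names and the sample numbers are hypothetical.

```python
import math
from typing import Sequence


def leakage_parameter(p_n: Sequence[float], p_guided: float) -> float:
    """Leakage parameter alpha = P_rad / (2 P), with P_rad = sum of the
    partial powers p_n radiated per unit length by each propagating order."""
    if p_guided <= 0.0:
        raise ValueError("guided power must be positive")
    return sum(p_n) / (2.0 * p_guided)


def guided_power(p0: float, alpha: float, x: float) -> float:
    """Power remaining in the leaky wave after a distance x: P0 exp(-2 alpha x)."""
    return p0 * math.exp(-2.0 * alpha * x)


if __name__ == "__main__":
    alpha = leakage_parameter([0.02, 0.01], p_guided=1.5)  # illustrative values
    print("alpha =", alpha)
    # After a length 1/(2 alpha) the guided power has dropped by a factor e.
    print(guided_power(1.0, alpha, 1.0 / (2.0 * alpha)))
```

Each term pₙ/(2P) can be read as the partial leakage of one diffracted order, so the contribution of an individual order is obtained by passing a one-element list.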

The coupled transmission-line equations, Eq. (12) and Eq. (13), may be solved in
many different ways. For the present problem, we find that it is more convenient to
introduce the linear transformation:

$$ U_n^{\pm} = \tfrac{1}{2} \left( V_n \pm Z_n I_n \right) , \qquad (17) $$

$$ u_n^{\pm} = \tfrac{1}{2} \left( v_n \pm Z_n j_n \right) , \qquad (18) $$

to obtain the two decoupled first-order differential equations:

$$ \frac{dU_n^{\pm}}{dz} = \pm i k_{zn} U_n^{\pm} - u_n^{\pm} . \qquad (19) $$

It may be readily verified from Eq. (19) that, for the time dependence exp(-iωt) adopted
here, $U_n^{+}$ and $U_n^{-}$ are voltage waves that travel along the +z and -z directions,
respectively.
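The decoupling can be checked in a few lines. The sketch assumes that the source-excited line equations have the form dVₙ/dz = i k Zₙ Iₙ - vₙ and dIₙ/dz = i k Yₙ Vₙ - jₙ (with k written for the propagation factor of the n-th line), that Zₙ is constant across the grating layer, and that ZₙYₙ = 1; these are assumptions consistent with the surrounding text rather than statements quoted from it.

```latex
% Differentiate U_n^{\pm} = \tfrac12 (V_n \pm Z_n I_n) with Z_n independent of z:
\frac{dU_n^{\pm}}{dz}
  = \tfrac12 \left( \frac{dV_n}{dz} \pm Z_n \frac{dI_n}{dz} \right)
  = \tfrac12 \left[ \, i k Z_n I_n - v_n \pm Z_n \left( i k Y_n V_n - j_n \right) \right].

% Use Z_n Y_n = 1 and regroup the field terms and the source terms:
\frac{dU_n^{\pm}}{dz}
  = \pm i k \cdot \tfrac12 \left( V_n \pm Z_n I_n \right) - \tfrac12 \left( v_n \pm Z_n j_n \right)
  = \pm i k \, U_n^{\pm} - u_n^{\pm}.

% With sources switched off, U_n^{\pm}(z) \propto \exp(\pm i k z); combined with
% \exp(-i\omega t) these are waves travelling toward +z and -z respectively.
```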

Inside the grating region, the solution of the two decoupled first-order differential
equations in Eq. (19) can readily be written as

$$ U_n^{\pm}(z) = \left[ U_n^{\pm}(0) - \int_0^{z} u_n^{\pm}(z') \exp(\mp i k_{zn} z')\, dz' \right] \exp(\pm i k_{zn} z) \qquad (20) $$


The boundary conditions at z = 0 and z = $t_g$ may be written as

$$ U_n^{+}(0) = \Gamma_{gn}^{-} U_n^{-}(0) \quad \text{and} \quad U_n^{-}(t_g) = \Gamma_{gn}^{+} U_n^{+}(t_g) \qquad (21) $$

where $\Gamma_{gn}^{+}$ and $\Gamma_{gn}^{-}$ are the reflection coefficients of the n-th harmonic at the upper and
lower surfaces of the grating, respectively. Introducing Eq. (20) into Eq. (21), we
obtain

$$ U_n^{-}(0) = \frac{ G_n^{+} \Gamma_{gn}^{+} \exp(i 2 k_{zn} t_g) - G_n^{-} }{ R_n } \qquad (22) $$

$$ U_n^{+}(t_g) = \frac{ \left( G_n^{+} - \Gamma_{gn}^{-} G_n^{-} \right) \exp(i k_{zn} t_g) }{ R_n } \qquad (23) $$

with

$$ G_n^{\pm} = \int_0^{t_g} u_n^{\pm}(z) \exp(\mp i k_{zn} z)\, dz \qquad (24) $$

$$ R_n = \Gamma_{gn}^{+} \Gamma_{gn}^{-} \exp(i 2 k_{zn} t_g) - 1 \qquad (25) $$


It is noted that $R_n$ is generally non-vanishing for n ≠ 0. The condition $R_n \neq 0$ for
n ≠ 0 is equivalent to excluding the resonant case of Bragg reflection $\beta d = m\pi$. Hence
$U_n^{-}(0)$ and $U_n^{+}(t_g)$ are well determined. We may therefore find the powers needed in
Eq. (15) by using Eq. (21) to yield

$$ p_n^{(a)} = \mathrm{Re}[Y_n^{(a)}]\, |V_n(t_g)|^2 = \left( 1 - |\Gamma_{gn}^{+}|^2 \right) |U_n^{+}(t_g)|^2\, \mathrm{Re}[Y_n^{(g)}] \qquad (26) $$

$$ p_n^{(s)} = \mathrm{Re}[Y_n^{(s)}]\, |V_n(0)|^2 = \left( 1 - |\Gamma_{gn}^{-}|^2 \right) |U_n^{-}(0)|^2\, \mathrm{Re}[Y_n^{(g)}] \qquad (27) $$

With the values of $p_n$ given by Eqs. (26) and (27), the leakage parameter α can be found
via Equations (15) and (16).
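The procedure for one space harmonic can be sketched numerically. The code assumes the decoupled equations dU±/dz = ±ikU± - u±, source integrals G± = ∫ u±(z) exp(∓ikz) dz over the grating layer, and the boundary conditions U⁺(0) = Γ⁻U⁻(0), U⁻(t_g) = Γ⁺U⁺(t_g); the closed forms used are obtained by solving those two conditions and are an assumption about the garbled source formulas. All names and sample values are hypothetical.

```python
import cmath


def solve_harmonic(g_plus, g_minus, gamma_plus, gamma_minus, k_zn, t_g):
    """Return (U_minus(0), U_plus(t_g)) for one space harmonic.

    Derived from U(t_g) = [U(0) - G] * exp(+/- i k t_g) together with the
    two reflection conditions at z = 0 and z = t_g.
    """
    phase = cmath.exp(1j * k_zn * t_g)
    r_n = gamma_plus * gamma_minus * phase * phase - 1.0
    if abs(r_n) < 1e-12:
        raise ZeroDivisionError("resonant case: R_n vanishes")
    u_minus_0 = (gamma_plus * g_plus * phase * phase - g_minus) / r_n
    u_plus_t = (g_plus - gamma_minus * g_minus) * phase / r_n
    return u_minus_0, u_plus_t


def boundary_residuals(g_plus, g_minus, gamma_plus, gamma_minus, k_zn, t_g):
    """Check the solution by propagating across the layer and re-applying
    both boundary conditions; both residuals should be ~0."""
    u_minus_0, u_plus_t = solve_harmonic(
        g_plus, g_minus, gamma_plus, gamma_minus, k_zn, t_g)
    u_plus_0 = gamma_minus * u_minus_0                       # condition at z = 0
    u_plus_t_check = (u_plus_0 - g_plus) * cmath.exp(1j * k_zn * t_g)
    u_minus_t = (u_minus_0 - g_minus) * cmath.exp(-1j * k_zn * t_g)
    return (abs(u_plus_t_check - u_plus_t),
            abs(u_minus_t - gamma_plus * u_plus_t))          # condition at z = t_g


def radiated_powers(u_minus_0, u_plus_t, gamma_plus, gamma_minus, y_g):
    """Partial powers into air (through z = t_g) and substrate (through z = 0),
    each written as incident power times (1 - |Gamma|^2)."""
    p_air = (1.0 - abs(gamma_plus) ** 2) * abs(u_plus_t) ** 2 * y_g.real
    p_sub = (1.0 - abs(gamma_minus) ** 2) * abs(u_minus_0) ** 2 * y_g.real
    return p_air, p_sub


if __name__ == "__main__":
    args = (0.3 + 0.1j, -0.2 + 0.05j, 0.2 - 0.1j, 0.15 + 0.05j, 2.0, 0.7)
    print(boundary_residuals(*args))
    u_m, u_p = solve_harmonic(*args)
    print(radiated_powers(u_m, u_p, args[2], args[3], complex(1.0, 0.0)))
```

The residual check is independent of how the source integrals were obtained, so it can be reused to validate any closed-form expression substituted for G±.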


In general, the foregoing relations yield α in the form of an elaborate function of
all the grating parameters. However, for a grating having the rectangular profile shown
in Fig. 1(b), most of the preceding relations simplify considerably. In this case, the
integrals for $G_n^{\pm}$ in Eq. (24) can be explicitly evaluated as:

[Eq. (28): closed-form expression for $G_n^{\pm}$, with prefactor involving $\omega \epsilon_0 \epsilon_n$, the
normalization constant C, the factors $(K_n \pm 1)$ and terms of the type
$\{1 - \exp[\pm i (k_z \pm k_{zn}) t_g]\} / (k_z \pm k_{zn})$; signs and subscripts not legible in this copy.]

where C is a normalization constant and $K_n$ is defined by:

$$ K_n = \begin{cases} 0 , & \text{for TE modes,} \\ \beta_0 \beta_n / k_z k_{zn} , & \text{for TM modes.} \end{cases} \qquad (29) $$

Inserting $G_n^{\pm}$ into Eqs. (22) and (23), we can find α via Equations (15) and (16). This
can be easily carried out by numerical computation. However, for small and large
values of $t_g$, we can derive the following simplified approximations:

[Eq. (30): approximate expressions for the leakage, one for $t_g$ small and one for $t_g$
large, involving $K_n$, $\omega \epsilon_0 \epsilon_n$, the admittances, the power P and the factors $A_n$ and $B_n$;
not legible in this copy.]

where $A_n$ and $B_n$ are both nearly equal to unity for most practical cases.

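Another interesting quantity is the partition of energy leaked into the air and
substrate regions. The upper and lower bounds for the ratio of power radiated into the air
and the substrate regions are given, respectively, by

$$ \left( \frac{p_n^{(a)}}{p_n^{(s)}} \right)_{ub} = \frac{1 + |\Gamma_n^{+}|}{1 - |\Gamma_n^{+}|} \cdot \frac{1 + |\Gamma_n^{-}|}{1 - |\Gamma_n^{-}|} \qquad (31a) $$

$$ \left( \frac{p_n^{(a)}}{p_n^{(s)}} \right)_{lb} = \frac{1 - |\Gamma_n^{+}|}{1 + |\Gamma_n^{+}|} \cdot \frac{1 - |\Gamma_n^{-}|}{1 + |\Gamma_n^{-}|} \qquad (31b) $$

Since the geometrical mean of Eq. (31) is unity and $|\Gamma_n^{\pm}|$ is usually small, the leakage
power tends to divide equally between the substrate and air regions in most cases.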


The formulas in Eqs. (30) and (31) are simple enough for computations using a
hand calculator; they are therefore particularly useful for practical design of periodic
optical couplers, as we shall show in the following report.

DESIGN OF GRATING COUPLERS
T.  Tamir  and  S.T.  Peng 

The coupling performance of a grating coupler is known to be most strongly
influenced by the leakage parameter α, which characterizes the leaky wave supported in
the grating region. The determination of α has been discussed in the preceding report.
The present report focuses on the derivation of simplified but accurate formulae for α
and on the design criteria that can thereby be inferred for fabricating gratings having
desirable characteristics.

Usually, the need for constructing a grating coupler occurs when a planar thin-film
optical guide is given, i.e., the equivalent refractive index $N = \beta_{sw}/k_0$ of that guide
is known. The problem in that case is to design a grating with periodicity d
and height $t_g$, with d and $t_g$ determined in such a manner as to reach a desired
value for the leakage parameter α. In addition, one usually desires to have a single
diffraction order in order to maximize the coupling efficiency; this is achieved by
requiring that

$$ | N - (\lambda/d) | < \sqrt{\epsilon_a} = 1 , \qquad (1a) $$

$$ 2(\lambda/d) - N > \sqrt{\epsilon_s} > 1 , \qquad (1b) $$

where the notation is identical to that in the preceding report.

Conditions (1) are summarized in Fig. 1, wherein the shaded regions indicate the
range of values for N and λ/d that lead to a single outgoing beam in each of the two air
regions above and below the guide-grating structure on which a surface wave is incident,
i.e., in an output coupler. By reciprocity, a beam incident on an input coupler will
satisfy analogous conditions for maximizing the coupling performance.
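The single-beam requirement can be explored with a short sketch. It assumes the usual grating relation that the n-th diffracted order has a normalized longitudinal wavenumber N + nλ/d and propagates in a medium of refractive index n_medium when |N + nλ/d| < n_medium; this criterion, the function names and the sample indices are illustrative assumptions, not values from the report.

```python
from typing import List


def radiating_orders(n_eff: float, lam_over_d: float, n_medium: float,
                     max_order: int = 5) -> List[int]:
    """Diffraction orders n (|n| <= max_order) that propagate, rather than
    decay, in a medium of refractive index n_medium."""
    return [n for n in range(-max_order, max_order + 1)
            if abs(n_eff + n * lam_over_d) < n_medium]


def single_beam(n_eff: float, lam_over_d: float,
                n_air: float = 1.0, n_sub: float = 1.5) -> bool:
    """True when only the n = -1 order radiates, both into the air and into
    the substrate, i.e. no second order leaks anywhere."""
    return (radiating_orders(n_eff, lam_over_d, n_air) == [-1]
            and radiating_orders(n_eff, lam_over_d, n_sub) == [-1])


if __name__ == "__main__":
    n_eff = 1.6  # hypothetical equivalent index of the guide
    for ratio in (1.0, 1.5, 2.0):
        print(ratio, single_beam(n_eff, ratio),
              radiating_orders(n_eff, ratio, 1.5))
```

Scanning `lam_over_d` for a fixed N in this way gives one horizontal cut through the kind of shaded-region chart described for Fig. 1.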

After establishing a suitable value of d/λ via Fig. 1, it now remains to determine
a corresponding value of $t_g$ that would yield the desired leakage parameter α. For
this purpose, we first consider gratings having rectangular profiles. Using Eq. (30)
from the preceding report, we then find:

(1) For $\alpha_g t_g < 1$:

[Eq. (2): expressions for $\alpha_t^{(c)} \lambda$ for TE modes and for TM modes, in terms of N, the
permittivities, λ/d, $t_g$ and $t_{eff}$; not legible in this copy.]

Here the superscript c in $\alpha_t^{(c)}$ indicates that the result applies to the canonic rectangular
profile having $d_1 = d/2$, whereas the subscript t indicates that Eq. (2) holds for small
values of $t_g$.

Fig. 1. $N = \beta_{sw}/k_0$ versus λ/d, indicating range of values (shown shaded) for
exciting a single diffraction order above the grating. Forward and
backward leakage conditions are indicated by different shadings.

(2) For $\alpha_g t_g > 1$:

[Eq. (3): expressions for $\alpha_T^{(c)} \lambda$ for TE modes and for TM modes, in terms of N, the
permittivities, λ/d and $t_{eff}$; not legible in this copy.]

Again, the superscript c in $\alpha_T^{(c)}$ refers to the canonic rectangular profile with $d_1 = d/2$;
the subscript T indicates that Eq. (3) holds for large $t_g$.

Relations (2) and (3) are basic and contain grating parameters, all of which are
readily known except for $t_{eff}$. The value of $t_{eff}$ may be (somewhat grossly) approximated
by $t_f$; for a better approximation, we use Eq. (11) with m = a in Eq. (2), whereas in
Eq. (3) we utilize Eq. (11) with m = g. Here Eq. (11) refers to that given for
$t_{eff}$ in the preceding report.

The accuracy of the above approximations is illustrated in Fig. 2, which describes
results for a typical grating operating in the fundamental $TE_0$ mode. Leakage occurs
only in the n = -1 order. The solid curves have been obtained by using the perturbation
procedure presented in the preceding report, whereas the points marked by X signs
denote values calculated by a rigorous approach.² The approximations (2) and (3) are
shown by dashed curves. The upper and lower bounds of αλ refer
to limit values that can easily be determined via Eq. (30) in the
preceding report.


Fig. 2. Variation of normalized leakage αλ and power ratio η versus $t_g/\lambda$,
for the $TE_0$ mode in a canonic grating.

Figure 2 describes a typical result for TE modes and similar results hold for TM
modes. It is also noted that the efficiency η of coupling into the upper air region
varies around 50%, as predicted at the end of the preceding report. It is significant to
observe that the dashed lines representing Eqs. (2) and (3) in Fig. 2 are very close to
the solid line representing the more accurate perturbation result. Furthermore, the
latter results are less than 30% higher in value than the exact results shown by the X
signs in Figure 2. Such an accuracy is well within that required for design purposes and
is consistent with achievable tolerances.

For gratings having profiles that are different from rectangular, results (2) and
(3) can be extended provided the different profile is symmetrical. In that case, the new
leakage factor α is given in terms of $\alpha^{(c)}$ of the rectangular profile by

$$ \alpha = \rho^2 \alpha^{(c)} \qquad (4) $$

where ρ is a reduction factor which can be readily calculated via an integration of ε(x,z)
over the profile shape. This integration is omitted here but, as an example, the result
for a trapezoidal shape is given by

[Eqs. (5) and (6): expression for ρ built from factors of the type sin(πδ)/(πδ), with a
separate form for $t_g > t_{gc}$, and the definition of δ; not legible in this copy.]

Here $t_{gc} = 1/\alpha_g$ is known, while $d_1$ and $d_2$ refer to the lower and upper bases of the
trapezoid, respectively.


Results for a trapezoidal profile are shown in Fig. 3, where again it is seen that
the dashed-line results (2) and (3), as modified by (5), are in good agreement with the
solid-line results of the accurate perturbation analysis.

The construction of a grating coupler having a prescribed value of α can thus be
based on the use of results (2), (3) and (5), which can be systematized into a series of
design data or graphs that are catalogued in terms of the given parameters N, $\epsilon_f$, and
suitable profile shapes. Such a procedure greatly simplifies the prediction of the
coupling characteristics and, by reducing the initial design steps to a set of simple
calculations of algebraic relationships, it avoids the use of complex computer programs
and very time-consuming calculations by means of high-precision computers.²


More elaborate details of the aspects presented here are given in a comprehensive
paper, where the effects of asymmetric (blazed) grating profiles are also discussed.

Fig. 3. Variation of normalized leakage αλ and power ratio
η versus $t_g/\lambda$ for the $TE_0$ mode in a grating having
a symmetric trapezoidal profile.

NETWORK METHODS FOR DIELECTRIC-GRATING APPLICATIONS
T.  Tamir,  S.  T.  Peng  and  K.C.  Chang 

The similarities between guided-wave components in the two areas of integrated
optics and microwave engineering have been well recognized. However, while equivalent
networks involving lumped and distributed elements have served as a powerful tool
in treating a wide range of microwave problems, the use of network methods in
integrated optics has so far been very limited in scope. The aim of this report is to show
the effectiveness of equivalent networks in integrated optics by presenting their novel
application to dielectric gratings, which play a dominant role in beam couplers, filters,
distributed-feedback lasers and other devices that incorporate periodic structures.

Network terminology has already been employed in the early microphotolithographic
work at infra-red,¹ while equivalent networks have been subsequently used mostly in
transverse-resonance analysis of dispersion curves for thin-film waveguides of the
strip² or planar³ varieties. Most recently, dielectric gratings of the type sketched in
Fig. 1 have been shown⁴ to be rigorously described by the transverse network
configuration given in Figure 2. Here B denotes a lossless network that couples all of the
space harmonics, each of which is represented by a transmission-line circuit in the
regions outside the grating. Whereas the complete network is rather complicated if a
large number of space harmonics must be accounted for, it nevertheless leads to
systematic computational procedures for structures with a larger number of layers and/or
for gratings having more complex profiles.

Fig. 1. Typical dielectric gratings used in applications for integrated optics.

Fig. 2. Equivalent network representation of the dielectric-grating configuration.

A great simplification in the equivalent network of Fig. 2 can be achieved by adopting a perturbation approach^5 which assumes that the grating region appears as a modification of a uniform layer having a relative permittivity $\epsilon_g$. One can then express the permittivity in that layer as

$$\epsilon(x) = \epsilon_g \Big[ 1 + \sum_{n \neq 0} \epsilon_n \exp\Big( i \frac{2n\pi}{d} x \Big) \Big], \qquad (1)$$

where the summation is regarded as a perturbation term. The electric field $E = E_y$ for TE modes can be written as

$$E(x, z) = \sum_n V_n(z) \exp(i \kappa_n x), \qquad (2)$$

where

$$\kappa_n = \kappa_0 + \frac{2n\pi}{d}, \qquad n = 0, \pm 1, \pm 2, \ldots \qquad (3)$$

and $\kappa_0$ is generally complex.
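As a numerical illustration of Eq. (3), the following Python sketch lists the longitudinal wavenumbers of a few space harmonics and classifies each one as propagating or evanescent in a region of relative permittivity `eps_u`. The function name, the transverse-wavenumber relation $k_{zn} = (k_0^2 \epsilon_u - \kappa_n^2)^{1/2}$ and the sample values (a real $\kappa_0 = 1.5 k_0$, a period of 0.75 wavelength, an air region) are assumptions chosen for illustration; they are not taken from the report.

```python
import cmath
import math


def space_harmonics(kappa0, d, k0, eps_u, orders):
    """Return {n: (kappa_n, k_zn)} for the requested harmonic orders.

    kappa_n follows Eq. (3); k_zn = sqrt(k0^2 * eps_u - kappa_n^2) is the
    usual transverse wavenumber (an assumption of this sketch).
    """
    result = {}
    for n in orders:
        kappa_n = kappa0 + 2.0 * math.pi * n / d
        k_zn = cmath.sqrt(k0 ** 2 * eps_u - kappa_n ** 2)
        if k_zn.imag < 0:  # keep the root that decays away from the grating
            k_zn = -k_zn
        result[n] = (kappa_n, k_zn)
    return result


if __name__ == "__main__":
    wavelength = 1.0  # hypothetical units
    k0 = 2.0 * math.pi / wavelength
    harmonics = space_harmonics(1.5 * k0, 0.75 * wavelength, k0, 1.0, range(-2, 2))
    for n, (kappa_n, k_zn) in sorted(harmonics.items()):
        kind = "propagating" if abs(k_zn.imag) < 1e-9 else "evanescent"
        print(n, round(kappa_n / k0, 4), kind)
```

With these sample values only the n = -1 harmonic has a real transverse wavenumber in air, which is the situation exploited in a leaky-wave beam coupler.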

If Eqs. (1) and (2) are introduced into the wave equation and only first-order terms are retained, one gets

$$\frac{d^2 V_n}{dz^2} + (k_0^2 \epsilon_g - \kappa_n^2) V_n = -k_0^2 \epsilon_g \epsilon_n V_0(z), \qquad (4)$$

which holds for all $n \neq 0$. However, $V_0(z)$ represents a presumably known (unperturbed) incident field. Equation (4) implies that each partial electric field (space harmonic) $E_n(x, z)$ in Eq. (2) can be found in terms of the voltage $V_n(z)$ along the equivalent transverse network shown in Figure 3. Unlike Fig. 2, the network in Fig. 3 is a simple, uncoupled and independent transmission-line circuit characterized by propagation factors $k_{zn}^{(u)}$ and characteristic admittances $Y_n^{(u)}$, where

$$k_{zn}^{(u)} = (k_0^2 \epsilon_u - \kappa_n^2)^{1/2}, \qquad Y_n^{(u)} = k_{zn}^{(u)} / \omega\mu_0 . \qquad (5)$$

Fig. 3. Simplified network representation for TM-mode problems. For TE modes, the network is unchanged, but $v_n(z) = 0$.


Here the index u = a, g, f or s denotes the air, grating, film or substrate region, respectively. The voltage $V_n(z)$ is determined by the distributed current $j_n(z)$, which is proportional to the right-hand side of Eq. (4) and is known for any given incident voltage $V_0(z)$. A similar treatment holds for TM modes, except that the voltage $V_n(z)$ is then determined by both distributed currents $j_n(z)$ and voltages $v_n(z)$ inside the grating region ($0 < z < t_g$), as indicated in Figure 3.


The  principal  advantage  of  this  simplified  approach  is  that  it  can  be  easily  applied 
to any given excitation problem. Thus, the incident field may be a plane wave impinging obliquely on the grating (for scattering phenomena), a surface wave traveling longitudinally (for beam-coupling or filtering applications), or a standing wave along the grating (for distributed-feedback lasers). Each one of these situations is distinguished
from  the  other  only  by  the  fact  that  the  voltage  and  current  sources  are  different.  How- 
ever, these  sources  are  known  in  every  case,  and  they  are  simply  prescribed  by  the 
given  incident  field.  Because  of  their  relative  simplicity,  equivalent  networks  such  as 
that  in  Fig.  3 are  extremely  helpful  in  providing  both  physical  insight  and  accurate 
quantitative  evaluations  for  practical  design  problems. 


As an example of an optical beam-coupler application, which requires a leaky-wave regime, Fig. 4 shows the normalized attenuation $\alpha d / 2\pi$ of leaky waves supported by a corrugated GaAs wafer used in converting a light beam into a surface wave, or vice versa. In this case, the minima and maxima of the curves can be directly related to the height $t_g$ of the grating. As $t_g$ varies, the transmission line between $z = 0$ and $z = t_g$ in Fig. 3 also varies; hence the leaked energy (and therefore $\alpha$) exhibits maxima and minima because of the cyclic variations of the input impedances "seen" by the sources.

Fig. 4. Normalized leakage $\alpha d / 2\pi$ versus grating thickness $t_g$ for several modes guided by a corrugated GaAs slab. Points X indicate results obtained by the exact analysis.
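The cyclic behavior of the input impedance with line length can be illustrated by a short Python sketch. It evaluates the standard input impedance of a uniform transmission-line section through the reflection coefficient; the function name and the sample values are hypothetical, and the line length plays the role of the grating height $t_g$ only by analogy, not as a model of the actual GaAs structure.

```python
import cmath
import math


def input_impedance(z0, z_load, k, length):
    """Input impedance of a uniform line of given length terminated in z_load.

    Assumes an exp(-i*omega*t) time dependence, so that a forward wave
    varies as exp(+i*k*z) along the line.
    """
    gamma_load = (z_load - z0) / (z_load + z0)           # reflection at the load
    gamma_in = gamma_load * cmath.exp(2j * k * length)   # referred to the input
    return z0 * (1 + gamma_in) / (1 - gamma_in)


if __name__ == "__main__":
    k = 2.0 * math.pi  # wavenumber for a unit guide wavelength (assumed)
    for length in (0.0, 0.125, 0.25, 0.375, 0.5):
        print(length, input_impedance(1.0, 3.0, k, length))
```

For a real propagation factor the result repeats whenever the length grows by half a guide wavelength, which is the periodicity behind the maxima and minima described above.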

Another  important  advantage  of  this  network  approach  is  that  it  can  handle  gratings 
with arbitrary periodic profiles. Consequently, results can be obtained for configurations that, unlike the rectangularly corrugated gratings of Fig. 1, do not lend themselves
to an exact solution of the pertinent boundary-value problem.

A particularly interesting case is the asymmetrical triangular profile shown in Fig. 5(a), which can produce directional discrimination ("blazing") of incident surface waves. This is achieved by designing the grating so as to induce a stronger energy leakage into either the upper (air) or the lower (substrate) regions, depending on whether incidence is from the right or left, respectively, as suggested in Figure 5(b). In network terms, the desired blazing effect is obtained by choosing the sources $j_n(z)$ and $v_n(z)$ so as to direct the power flow selectively into either the air or the substrate region. For example, the sources can be judiciously distributed along the transmission line segment $0 < z < t_g$ so as to interfere destructively for the field in the substrate region while interfering constructively for the field in the air region. After determining a suitable source distribution, the geometrical shape of the grating profile can be obtained by means of Fourier transforms.

Fig. 5. Blazing of surface waves:
(a) Asymmetric grating profile for directional discrimination of incident surface waves;
(b) Result of grating asymmetry, showing that a surface wave incident from the right is leaked mostly into the substrate whereas a surface wave incident from the left is leaked mostly into the air region.

To  summarize,  the  greatest  advantage  of  the  equivalent-network  representation 
is  that  it  permits  a direct  interpretation  of  the  grating  characteristics  in  terms  of  the 
constituent  parts  of  the  transmission-line  circuit.  Thus,  the  leakage  properties  of 
grating couplers, the asymmetric behavior of triangular gratings, the Bragg-reflection regime of distributed-feedback lasers, etc. can all be phrased in terms of the transmission-line parameters and the fields associated with them. By extension, the application of such network methods to a wider class of integrated-optics configurations is expected to provide a powerful analytic technique for design-oriented purposes.

ON SIMULTANEOUS BLAZING OF TRIANGULAR GROOVE DIFFRACTION
GRATINGS

L.  S.  Cheo,  J.  Shmoys  and  A.  Hessel 

Numerical  results  for  plane  wave  scattering  from  a perfectly  conducting  diffrac- 
tion grating  with  triangular  groove  profile  show  that  simultaneous  blazing  with  100% 
efficiency  in  both  polarizations  is  obtainable  in  a Littrow  arrangement.  The  range  of 
angles of incidence for which such simultaneous blazing occurs is considerably narrower than that obtainable with a rectangular groove profile.

A.  Introduction 

In a previous paper^1 it was shown that a rectangular-groove perfectly conducting grating can be blazed to diffract all of the incident energy into the n = -1 spectral order, provided two conditions are met. One of these is the Bragg condition, equivalent to placing the grating in a Littrow mount; the other is that the depth of the grating, h, be adjusted to the correct value, which is a function of the angle of incidence $\theta$ and of the profile. Furthermore, it was pointed out that by examining the curves h vs. $\theta$ for both TE(P) and TM(S) polarizations and locating their intersections, conditions for simultaneous perfect blazing could be found. Further calculations of simultaneous perfect blazing by a rectangular groove grating, together with its experimental verification in the microwave region, were recently published by Jull et al.;^2 additional calculated performance data, including the effect of imperfect conductivity, were provided by Roumiguieres et al.^3

The question of possible simultaneous blazing of triangular groove gratings was raised by Stroke,^4 who indicated that somewhat shallower gratings than a standard echelette give better overall results even though they do not have the perfect blazing property in the TM polarization. Echelette (right-angle triangular) grating efficiency was also studied theoretically by Maystre and Petit,^5 whose results indicate that simultaneous blazing does not occur in this case. McPhedran and Waterworth^6 examined theoretically the efficiency of triangular profile gratings and came to the conclusion that for good performance in the TE polarization a grating deeper than the echelette is desired. They did not search for conditions for perfect blazing.

Fig. 1. Triangular-groove grating.

In the present work, dealing with perfect blazing of perfectly conducting, infinite triangular gratings, we have followed the conjecture of our previous paper^1 that under the constraint of the Bragg condition $\lambda = 2d \sin\theta$ (i.e., of a Littrow mount) a grating of any groove shape can be blazed perfectly simply by varying the groove depth h. In the present context this implies keeping f/d constant (cf. Fig. 1), setting $\lambda$ to $2d \sin\theta$, and varying h to find a zero of the specular reflection coefficient. It may be worth a comment that a right-angle echelette, because of the constraint imposed on its depth, would not be expected to yield simultaneous blazing in both polarizations. This is in fact observed in Fig. 2, which shows the TE efficiency of right-angle echelette gratings 100% blazed in the TM polarization as a function of groove depth. The data in this figure have been taken from Reference 5.
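The role of the Bragg (Littrow) condition can be made concrete with a small Python sketch based on the ordinary grating equation $\sin\theta_n = \sin\theta + n\lambda/d$. The sign convention, the function names and the sample angle are assumptions made for illustration. The last helper gives the angle at which a given order passes through grazing (a Rayleigh angle) when the Bragg condition is enforced, which follows from $\sin\theta\,|2n + 1| = 1$.

```python
import math


def littrow_ratio(theta_deg):
    """lambda/d that satisfies the Bragg condition lambda = 2 d sin(theta)."""
    return 2.0 * math.sin(math.radians(theta_deg))


def propagating_orders(theta_deg, wavelength_over_d, max_order=5):
    """Orders n with |sin(theta) + n*lambda/d| <= 1, with their angles in degrees."""
    s0 = math.sin(math.radians(theta_deg))
    orders = []
    for n in range(-max_order, max_order + 1):
        s_n = s0 + n * wavelength_over_d
        if abs(s_n) <= 1.0:
            orders.append((n, math.degrees(math.asin(s_n))))
    return orders


def littrow_rayleigh_angle(order):
    """Angle (degrees) at which `order` grazes the surface in a Littrow mount.

    Grazing means sin(theta) + order*lambda/d = +/-1; with lambda/d = 2 sin(theta)
    this reduces to sin(theta) = 1/|2*order + 1|.
    """
    return math.degrees(math.asin(1.0 / abs(2 * order + 1)))


if __name__ == "__main__":
    theta = 30.0  # hypothetical angle of incidence
    print(propagating_orders(theta, littrow_ratio(theta)))
    print(littrow_rayleigh_angle(1), littrow_rayleigh_angle(-2))
```

At 30 degrees only the n = 0 and n = -1 orders propagate, and the n = -1 order returns along the direction of incidence, so suppressing the specular order is equivalent to perfect blazing. The n = +1 and n = -2 orders reach grazing at the same angle, about 19.5 degrees.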


Fig.  2 Efficiency  in  TE(P)  polarization  vs.  blaze  angle  of  an 
echelette  grating  blazed  for  TM(S)  polarization. 

B.  Numerical  Results 

The numerical results of this report were obtained using the theoretical development of Jovicevic and Sesnic^7 with minor modifications in formulating the set of simultaneous Equations (12) and (13), (32) and (33). To accommodate the nonrectangular triangular groove shape, a Bessel function routine for nonintegral values of index was developed, thus extending the applicability and efficiency of the computer algorithm based on Equation (7). The Bessel function routine was based on the power series expansion truncated after the 15th term. The total number of space harmonics (diffraction orders) employed in computations was four, with a similar number of wedge modes (in the triangular region). Numerical stability of the solution was spot-checked with a larger number (6-8) of modes. Numerical integration of scalar products involved 26 mesh points over the grating period. Computations were performed on an IBM 370/158 system, in a time-sharing mode, using PL/1. The average CPU time per point was 6 s.
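A truncated power series of the kind described here can be sketched in a few lines of Python. This is not the authors' PL/1 routine; it is a minimal illustration that assumes the standard ascending series $J_\nu(x) = \sum_k (-1)^k (x/2)^{2k+\nu} / [k!\,\Gamma(k + \nu + 1)]$, a positive argument x, and an order $\nu > -1$. The function name is hypothetical.

```python
import math


def bessel_j_series(nu, x, terms=15):
    """J_nu(x) from the ascending power series, truncated after `terms` terms.

    Intended for x > 0 and nonintegral or integral nu > -1; accuracy degrades
    for large x because the series terms grow before they decay.
    """
    half_x = 0.5 * x
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * half_x ** (2 * k + nu) / (
            math.factorial(k) * math.gamma(k + nu + 1.0)
        )
    return total


if __name__ == "__main__":
    # Check against the closed form J_{1/2}(x) = sqrt(2/(pi x)) sin(x).
    x = 1.0
    print(bessel_j_series(0.5, x), math.sqrt(2.0 / (math.pi * x)) * math.sin(x))
```

For moderate arguments fifteen terms are ample; the half-integer order provides a convenient closed-form check.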

Figure 3 shows the required depth h/d for 100% blazing in the TM(S) polarization for f/d = 0, 0.25 and 0.5. We see here two disjoint branches. In Refs. 1-3 only the lower branch was shown for the rectangular groove profile. Actually an infinity of higher branches is expected, as can be clearly seen on the rectangular groove grating example. We observe that the character of the curves in both branches is similar; furthermore, the lower-branch curve is similar to that for the half-open rectangular profile, in that with decreasing values of $\theta$ the curve h/d rises, goes through a maximum, dips, and rises sharply near $\theta = 20°$, i.e., near a Rayleigh angle for the n = +1 and n = -2 spectral orders. It is of interest to note also that there is a weak dependence on f/d. Figure 4 shows the relative depth behavior for TE modes. Again we see a similarity to the half-open rectangular grating in that the rise is monotonic, and that the effect of f/d is weak. Figure 5 is a superposition of both the TM and TE blaze curves for f/d = 0.25. It clearly shows that intersections exist, i.e., simultaneous blazing is feasible, but that the physically meaningful operating condition would probably correspond to the intersection of the upper TM branch with the TE curve. The lower-branch simultaneous blaze point on the steep part of the TM curve would be expected to be quite susceptible to, e.g., edge effects, loss and grating imperfections. Numerical data for the simultaneous blaze points found are shown in Table I.

Fig. 3. Relative depth, h/d, vs. angle of incidence, $\theta$, of a triangular groove grating 100% blazed for TM(S) polarization. The Bragg condition, $\lambda/2d = \sin\theta$, was maintained; curves are shown for f/d = 0, f/d = 0.25, and f/d = 0.5.

Fig. 4. Relative depth, h/d, vs. angle of incidence, $\theta$, of a triangular groove grating 100% blazed for TE(P) polarization. The Bragg condition, $\lambda/2d = \sin\theta$, was maintained; curves are shown for f/d = 0, f/d = 0.25, and f/d = 0.5.

Fig. 5. Relative depth, h/d, vs. angle of incidence, $\theta$, of a 100% blazed triangular grating, under the Bragg condition $\lambda/2d = \sin\theta$ and with f/d = 0.25, for both polarizations, TE(P) and TM(S).

TABLE I. Parameters of simultaneously blazed triangular groove gratings.
Columns: vertex location f/d; normalized depth h/d; angle of incidence $\theta$.
Row groups: lower branch; upper branch.

C. Conclusions


The conjecture of Ref. 1 with respect to simultaneous blazing of gratings with an arbitrary groove profile has been verified numerically for the case of a triangular shape. It has been shown that simultaneous 100% blazing of a perfectly conducting infinite grating with a triangular profile is possible. However, unlike the rectangular shape, which permits simultaneous blazing over a wide range of incidence angles, the triangular groove profile restricts this angular range because of the weak dependence on f/d.

RADIATION PATTERNS OF DIELECTRIC SLAB WAVEGUIDES
L.  Bergstein 

A.  Introduction 

In an earlier report^1 we developed a method which allows the determination of the radiation patterns of terminated dielectric slab waveguides. As an example of the application of the method,^2 we determined the radiation patterns of two waveguides, one of optical thickness $nd = 0.20\lambda$ and one of thickness $nd = 1.20\lambda$. The results show that the radiation patterns of the waveguide of thickness $0.20\lambda$ are considerably narrower than those of the waveguide of thickness $1.20\lambda$, although exactly the opposite is to be expected: as the width of a waveguide increases, the width of its radiation pattern decreases.

In view of the importance of this problem in integrated optics, we have decided to investigate this question in more detail.

A close  examination  of  the  problem  at  hand  shows  that  for  dielectric  waveguides 
of  practical  interest  in  the  optical  region,  the  power  carried  by  the  non- radiating 
modes  and  in  the  waves  reflected  at  the  waveguide  termination  is  small  compared  to 
the  power  carried  in  the  propagating  modes.  For  the  approximate  determination  of 
the  radiation  patterns,  we  can  therefore  neglect  these  waves  and  consider  only  the 
radiating modes and the evanescent waves coupled to them. The error in the results so obtained should be on the order of a few percent. Indeed, this is verified by a comparison of our results with those obtained by more exact but highly cumbersome methods.

B.  Determination  of  the  Radiation  Patterns 

Fig. 1. Geometry of the dielectric slab waveguide.

Consider a dielectric slab waveguide of thickness $d$ and refractive index $n$ immersed in a medium of refractive index $n_1 < n$. The waveguide is terminated and radiates into a region of refractive index $n_2$. As shown in Fig. 1, we choose a Cartesian coordinate system with the z-axis coinciding with the axis of the waveguide and assume that there is no field variation in the y direction. Such a waveguide can support non-coupled TE and TM modes, with

$$\mathbf{E} = [0, E_y, 0], \qquad \mathbf{H} = [H_x, 0, H_z]$$

for the TE modes, and

$$\mathbf{E} = [E_x, 0, E_z], \qquad \mathbf{H} = [0, H_y, 0]$$

for the TM modes. If we let $\psi(x, z)$ be equal to either $E_y$ (for TE modes) or $H_y$ (for TM modes), it is readily found upon application of Maxwell's equations and the appropriate boundary conditions that

$$\psi(x, z) = \begin{cases} (-1)^m \cos\phi \; e^{i k u_0 z} \, e^{k \xi (x + d/2)}, & \text{for } x < -d/2, \\ e^{i k u_0 z} \cos(k w_0 x - m\pi/2), & \text{for } -d/2 \le x \le +d/2, \\ \cos\phi \; e^{i k u_0 z} \, e^{-k \xi (x - d/2)}, & \text{for } d/2 < x. \end{cases} \qquad (2)$$

In these equations, $\lambda$ is the free-space wavelength, $k_0 = 2\pi/\lambda$ is the free-space propagation constant, $k = n k_0$, $\theta_0$ is the synchronous angle of the surface-wave mode with respect to the normal to the waveguide surfaces (i.e., the x direction), $u_0 = \sin\theta_0$, $w_0 = \cos\theta_0$, $\alpha_1 = n_1/n$ and $\xi = \sqrt{u_0^2 - \alpha_1^2}$. For even modes, m = 0, 2, 4, ..., whereas for odd modes, m = 1, 3, 5, .... Moreover,

$$\tan\phi = g \, \xi / w_0 ,$$

where

$$g = \begin{cases} 1, & \text{for TE modes}, \\ 1/\alpha_1^2, & \text{for TM modes}. \end{cases} \qquad (4)$$

The synchronous angle $\theta_0$ is determined by the transverse resonance condition,

$$\tan\Big(\pi \frac{nd}{\lambda} w_0\Big) = \tan\Big(\phi + m\frac{\pi}{2}\Big) = \begin{cases} \tan\phi, & \text{for even modes}, \\ -1/\tan\phi, & \text{for odd modes}. \end{cases} \qquad (6)$$
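The transverse resonance condition is transcendental, so the synchronous angle has to be found numerically. The Python sketch below does this by bisection on $w_0 = \cos\theta_0$, using the branch $\pi (nd/\lambda) w_0 = \phi + m\pi/2$ with $\tan\phi = g\xi/w_0$ and $\xi = \sqrt{1 - \alpha_1^2 - w_0^2}$. The function name, the argument names and the choice of bisection are assumptions of this sketch, not the procedure used in the report.

```python
import math


def synchronous_angle(D, alpha1, m=0, polarization="TE", tol=1e-12):
    """Solve pi*D*w0 = phi + m*pi/2 for w0 = cos(theta0) by bisection.

    D = n*d/lambda is the relative optical thickness and alpha1 = n1/n.
    Returns (theta0_deg, w0, xi), or None when mode m is below cutoff.
    """
    g = 1.0 if polarization == "TE" else 1.0 / alpha1 ** 2
    w_max = math.sqrt(1.0 - alpha1 ** 2)  # cutoff, where xi = 0

    def residual(w0):
        xi = math.sqrt(max(1.0 - alpha1 ** 2 - w0 ** 2, 0.0))
        phi = math.atan2(g * xi, w0)  # equals pi/2 at w0 = 0
        return math.pi * D * w0 - phi - 0.5 * m * math.pi

    if residual(w_max) <= 0.0:
        return None
    lo, hi = 0.0, w_max  # residual is negative at lo and positive at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    w0 = 0.5 * (lo + hi)
    xi = math.sqrt(max(1.0 - alpha1 ** 2 - w0 ** 2, 0.0))
    return math.degrees(math.acos(w0)), w0, xi


if __name__ == "__main__":
    for D in (0.2, 1.2):
        print(D, synchronous_angle(D, 1.0 / 1.578))
```

Because the residual increases monotonically with $w_0$ between 0 and the cutoff value, a single sign change brackets the root for each guided mode.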

We can now proceed to determine the radiation pattern of the waveguide.

In the region $z > 0$, the electromagnetic field can be expressed as a superposition of plane waves propagating at angles $\varphi$ with respect to the z direction. Thus, we set

$$\psi(x, z) = \int_{-\infty}^{\infty} F(u) \, e^{i k_2 (u x + w z)} \, du , \qquad (7)$$

where

$$u = \sin\varphi \quad \text{and} \quad w = \cos\varphi \qquad (8)$$

and $k_2 = n_2 k_0$. To determine the radiation pattern $F(u)$ we apply the boundary condition at $z = 0$ and obtain

$$F(u) = k_2 \int_{-\infty}^{\infty} A(x) \, e^{-i k_2 x u} \, dx , \qquad (9)$$

where $A(x) = \psi(x, 0)$, with $\psi(x, z)$ as given by Equation (2). This gives

$$F(u) = \alpha_2 \Big\{ \cos\phi \Big[ \frac{e^{-i\pi D U}}{\xi + iU} + (-1)^m \frac{e^{i\pi D U}}{\xi - iU} \Big] + e^{-i m \pi/2} \frac{\sin[\pi D (w_0 - U)]}{w_0 - U} + e^{i m \pi/2} \frac{\sin[\pi D (w_0 + U)]}{w_0 + U} \Big\} , \qquad (10)$$

where $\alpha_2 = n_2/n$,

$$U = \frac{n_2}{n} u = \alpha_2 u , \qquad (11)$$

and

$$D = nd/\lambda \qquad (12)$$

is the relative optical thickness of the waveguide. Setting

$$\psi = \pi D U = \pi D \alpha_2 u , \qquad (13)$$
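it is readily found from Eq. (10) that

$$F_{es}(u) = F_{es}(0) \, \frac{\cos\psi - \dfrac{\psi}{\xi \pi D} \sin\psi}{1 + \dfrac{w_0^2 - \xi^2}{(w_0 \xi \pi D)^2} \psi^2 - \dfrac{\psi^4}{(w_0 \xi \pi^2 D^2)^2}} \qquad (14a)$$

for even TE modes,

$$F_{ep}(u) = F_{ep}(0) \, \frac{\Big(1 + \dfrac{\psi^2}{(u_0 \pi D)^2}\Big) \cos\psi - \dfrac{\alpha_1^2 \psi}{u_0^2 \xi \pi D} \sin\psi}{1 + \dfrac{w_0^2 - \xi^2}{(w_0 \xi \pi D)^2} \psi^2 - \dfrac{\psi^4}{(w_0 \xi \pi^2 D^2)^2}} \qquad (14b)$$

for even TM modes,

$$F_{os}(u) = F_{es}(0) \, \frac{\sin\psi + \dfrac{\psi}{\xi \pi D} \cos\psi}{1 + \dfrac{w_0^2 - \xi^2}{(w_0 \xi \pi D)^2} \psi^2 - \dfrac{\psi^4}{(w_0 \xi \pi^2 D^2)^2}} \qquad (14c)$$

for odd TE modes, and

$$F_{op}(u) = F_{ep}(0) \, \frac{\Big(1 + \dfrac{\psi^2}{(u_0 \pi D)^2}\Big) \sin\psi + \dfrac{\alpha_1^2 \psi}{u_0^2 \xi \pi D} \cos\psi}{1 + \dfrac{w_0^2 - \xi^2}{(w_0 \xi \pi D)^2} \psi^2 - \dfrac{\psi^4}{(w_0 \xi \pi^2 D^2)^2}} \qquad (14d)$$

for odd TM modes, where $F_{es}(0) = 2 \alpha_2 \cos\phi \, (1 - \alpha_1^2) / (\xi w_0^2)$ and $F_{ep}(0)$ is the corresponding on-axis value for the even TM modes.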

For the dominant modes (m = 0), $A(x)$ is positive for all x. The absolute maximum of $F(u)$ for these modes will therefore occur at $u = 0$. Figure 2 shows the radiation patterns $|F(u)|$ of the first-order TM modes for two slab waveguides of optical thickness $nd = 0.20\lambda$ and $nd = 1.20\lambda$, respectively. Both waveguides have the same refractive index ratio $(n_1/n)^2 = 0.40$ and are assumed to radiate into free space, with $(n_2/n)^2 = 0.40$. For comparison purposes we show the corresponding radiation patterns for these waveguides as found by Angulo^3 and by Hu and Bergstein^1,2 using much more cumbersome and involved methods. The agreement is almost perfect. We observe that the waveguide of thickness $1.20\lambda$ has a radiation pattern which is approximately four times wider than the pattern of the $0.20\lambda$-thick waveguide.

Fig. 2. Radiation patterns $|F(\varphi)|^2$ for the first-order TM modes of a terminated dielectric slab waveguide of refractive index n = 1.578 immersed in a medium of refractive index $n_1 = 1$. The graphs show $|F(\varphi)|^2$ for waveguides of optical thickness $nd = 0.20\lambda$ and $1.20\lambda$. The points marked "0" correspond to the results reported by Angulo and Hu.

In Fig. 3 we show the radiation patterns of the first-order TE modes for waveguides of refractive index ratios $(n_1/n)^2 = (n_2/n)^2 = 0.40$ and thicknesses $nd = 0.125\lambda$, $0.625\lambda$ and $3.625\lambda$, respectively. Again, it is seen that the radiation pattern of the $0.625\lambda$ waveguide is considerably wider than the pattern of the $0.125\lambda$ waveguide, and that the $0.125\lambda$ and $3.625\lambda$ waveguides have approximately the same radiation patterns.

Fig. 3. Radiation patterns $|F(\varphi)|^2$ for the first-order TE modes of a terminated dielectric slab waveguide of refractive index n = 1.578 immersed in a medium of refractive index $n_1 = 1$. The graphs show $|F(\varphi)|^2$ for waveguides of optical thickness $nd = 0.125\lambda$, $0.625\lambda$ and $3.625\lambda$.

We define the angular beamwidth ABW = $2(\Delta\varphi)$ of the first-order mode radiation patterns as the range of angles $\varphi$ for which $|F(u)|^2 \ge \frac{1}{2} |F(0)|^2$. Using Eqs. (14a) and (14b), the half-power width $\Delta\varphi$ can be readily determined as a function of the waveguide thickness $D = nd/\lambda$. The results are given in Fig. 4, where $2(\Delta\varphi)$ is shown for TE and TM modes as a function of D for various values of the refractive index ratio $\alpha_1 = n_1/n$. It is assumed that the waveguide radiates into free space. For comparison purposes, the figure also shows the half-power width for the limiting case of $n \to \infty$ and $n_1/n \to 0$, that is, the width $2(\Delta\varphi_0)$ of a radiation pattern which would be obtained if we were to neglect the evanescent waves associated with the propagating modes. The field distribution for the first-order modes is then given by

$$A(x) = \begin{cases} \cos(\pi x / d), & \text{for } -d/2 \le x \le d/2, \\ 0, & \text{otherwise}, \end{cases}$$

resulting in a radiation pattern

$$F_0(u) = \cos(\pi D U) / [1 - (2 D U)^2] ,$$

and a half-power width

$$2(\Delta\varphi_0) = 2 \sin^{-1}(0.594 \lambda / n_2 d) = 2 \sin^{-1}(0.594 / \alpha_2 D) . \qquad (15)$$
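The constant 0.594 in Eq. (15) can be checked numerically: it is the value of $X = \alpha_2 D u$ at which the limiting pattern $\cos(\pi X)/[1 - (2X)^2]$ falls to $1/\sqrt{2}$ of its on-axis value. The Python sketch below finds that root by bisection and evaluates the limiting beamwidth; the function names and the bracketing interval are choices made for this illustration.

```python
import math


def half_power_constant(tol=1e-12):
    """Root X of cos(pi*X) / (1 - 4*X^2) = 1/sqrt(2) on the interval (0.5, 1)."""
    target = 1.0 / math.sqrt(2.0)

    def f(x):
        return math.cos(math.pi * x) / (1.0 - 4.0 * x * x) - target

    lo, hi = 0.51, 1.0  # f(lo) > 0 and f(hi) < 0; avoids the 0/0 point X = 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def limiting_beamwidth_deg(D, alpha2):
    """2*asin(X / (alpha2*D)) in degrees, or None if no half-power angle exists."""
    arg = half_power_constant() / (alpha2 * D)
    if arg > 1.0:
        return None
    return 2.0 * math.degrees(math.asin(arg))


if __name__ == "__main__":
    print(half_power_constant())
    print(limiting_beamwidth_deg(2.0, 1.0))
```

When $\alpha_2 D$ is smaller than the constant, the limiting pattern never falls to half power within the visible range of angles, which is why the second function can return `None`.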

The results are rather interesting. They show that the width of the radiation pattern first increases with increasing waveguide thickness, reaching a maximum when the optical thickness nd of the waveguide is on the order of a wavelength, and then decreases as the waveguide thickness further increases, approaching the value of $2(\Delta\varphi_0)$ as given by Equation (15). As an example, it is seen that when n = 1.578 and $n_1$ = 1.00, the width of the TM radiation pattern of a waveguide of thickness $nd \approx 0.25\lambda$ is the same as that of a waveguide with $nd = 4.5\lambda$; the TM pattern width of a waveguide with $nd \approx 0.125\lambda$ is the same as that of a slab with $nd \approx 10\lambda$. Corresponding results are obtained for other refractive index ratios. Of course, it is not hard to find a physical explanation for these results.

Fig. 4. Angular beamwidth $2(\Delta\varphi)$ of the first-order mode radiation patterns of terminated dielectric slab waveguides as a function of the optical thickness $nd/\lambda$ of the waveguide. The graphs show $2(\Delta\varphi)$ for various values of the refractive index ratio $n_1/n$. Also shown is $2(\Delta\varphi_0)$ for the limiting case $n \to \infty$ or $n_1/n \to 0$.

When the waveguide thickness is small, the synchronous angle $\theta_0$ of the surface-wave modes is close to the critical angle $\theta_c = \sin^{-1}(n_1/n)$ of total reflection. As a result, the attenuation factor $\xi = \sqrt{\sin^2\theta_0 - (n_1/n)^2}$ for the evanescent waves outside the waveguide is small and the evanescent "tail" is large, making the waveguide look much wider and the radiation pattern narrower. As the waveguide thickness increases, the synchronous angle increases and so does the attenuation factor $\xi$. This results in a shorter evanescent tail and in a decrease in the apparent waveguide thickness. Finally, as the waveguide thickness further increases, the effect of the evanescent waves becomes negligible and the width of the radiation pattern decreases with increasing thickness nd.

To further compare our results with those available in the literature, we show in Fig. 5 the radiation patterns (as given by Eq. (14d)) of the second-order TM modes for two slab waveguides of optical thickness $nd = 0.75\lambda$ and $nd = 1.20\lambda$, respectively, immersed in and radiating into free space. The results found by Angulo^3 and Hu^2 are also shown in the figure. For the $0.75\lambda$ waveguide there is perfect agreement between our results and those of Angulo and Hu, whereas for the $1.20\lambda$ waveguide there is a discrepancy in the results at the tail end of the radiation pattern. The radiation patterns of these waveguides for the second-order TE modes are shown in Fig. 6. We again observe that the TE and TM radiation patterns of the $1.20\lambda$ waveguide are considerably wider than the corresponding patterns of the $0.75\lambda$ waveguide.

Fig. 5. Radiation patterns $|F(\varphi)|^2$ for the second-order TM modes of a terminated dielectric slab waveguide of refractive index n = 1.578 immersed in a medium of refractive index $n_1 = 1$. The graphs show $|F(\varphi)|^2$ for waveguides of optical thickness $nd = 0.75\lambda$ and $1.20\lambda$. The points marked "0" correspond to the results reported by Angulo and Hu.

Fig. 6. Radiation patterns $|F(\varphi)|^2$ for the second-order TE modes of a terminated dielectric slab waveguide of refractive index n = 1.578 immersed in a medium of refractive index $n_1 = 1$. The graphs show $|F(\varphi)|^2$ for waveguides of optical thickness $nd = 0.75\lambda$ and $1.20\lambda$.


Joint  Services  Technical  Advisory  Committee 

F44620-74-C-0056  L.  Bergstein 

REFERENCES 

1. E-W. Hu and L. Bergstein, "Analysis of Surface Waveguide Discontinuities," Progress Report No. 41 to JSTAC, Polytech. Inst. of New York, Report No. R-452.41-76, pp. 22-26 (1976).

2. E-W. Hu, "Optical Waveguide Discontinuities," Ph.D. Dissertation, Polytech. Inst. of New York (June 1976).

3. C. Angulo, "Diffraction of Surface Waves by a Semi-Infinite Dielectric Slab," Ph.D. Dissertation, Polytech. Inst. of New York (June 1955).

POLARIZATION  INDEPENDENT  DIELECTRIC  CONSTANT  PROFILES 
L.  Bergstein 

A.  Introduction 

An  electromagnetic  field  propagating  in  linear,  homogeneous  and  isotropic  media 
can  be  decomposed  into  two  mutually  independent  constituent  fields,  one  which  is  trans- 
verse electric  (TE)  and  one  which  is  transverse  magnetic  (TM)  with  respect  to  an  arbi- 
trarily chosen  direction  (of  propagation).^  Although  the  electric  and  magnetic  fields 
associated  with  these  modes  are  distinct,  the  respective  intensity  distributions  are  iden- 
tical and  the  TE  and  TM  signals  propagate  with  the  same  group  velocity. 

A TE-TM  decomposition  is  also  possible  under  certain  conditions  in  inhomogeneous 
media,  as  for  example  when  the  variation  of  the  dielectric  constant  and/or  the  magnetic 
permeability  is  one-dimensional.  In  the  inhomogeneous  case,  however,  the  intensity 
distributions  of  the  two  fields  are  usually  distinct  and  the  TE  and  TM  signals  propagate 
with  different  group  velocities.  We  investigate  here  the  question  whether  separable 
inhomogeneity  profiles  can  be  found  for  which  the  TE  and  TM  modes  have  identical  in- 
tensity distributions  in  direction  of  propagation  and  propagate  with  the  same  group  ve- 
locity. A waveguide  made  of  such  an  inhomogeneous  medium  can  thus  support  a non- 
polarized  signal.  We  have  earlier  solved  the  problem  for  the  case  of  plane- stratified 
and  rotationally  symmetric  cylindrical  media.  We  now  consider  the  general  case  and 
derive  a set  of  necessary  and  sufficient  conditions  for  the  polarization  independent 
propagation  of  non-coupled  TE  and  TM  modes.  As  an  example,  we  determine  the  polar- 
ization independent  dielectric  constant  profiles  for  various  geometries. 

B.  Polarization  Independent  Propagation 

The relation between the electric field $\mathbf{E}$ and magnetic field $\mathbf{H}$ of an electromagnetic wave of radian frequency $\omega$ propagating in a charge-free region of a linear and isotropic medium is given by two of Maxwell's equations,

$$\nabla \times \mathbf{E} = i \omega \mu \mathbf{H} , \qquad \nabla \times \mathbf{H} = -i \omega \epsilon \mathbf{E} , \qquad (1)$$

where i is the imaginary unit, $\epsilon$ is the dielectric constant of the medium, $\mu$ is its magnetic permeability, and $\nabla$ is the vector differential operator (with $\nabla \times \mathbf{A}$ = curl $\mathbf{A}$, $\nabla \cdot \mathbf{A}$ = div $\mathbf{A}$ and $\nabla\psi$ = grad $\psi$). The time-dependent factor $\exp(-i\omega t)$ is understood and is left out.

Maxwell's equations can be simplified when the variation of the dielectric constant $\epsilon$ and/or magnetic permeability $\mu$ is only two-dimensional. We choose the direction in which there is no variation of the media parameters as the n direction and decompose the electric and magnetic fields into transverse and longitudinal components with respect to that direction. Thus, we set

$$\mathbf{E} = \mathbf{E}_T + \mathbf{u}_n E_n \quad \text{and} \quad \mathbf{H} = \mathbf{H}_T + \mathbf{u}_n H_n , \qquad (2)$$

where $\mathbf{u}_n$ is the unit vector in the n direction, and the subscripts T and n stand for "transverse" and "normal" (i.e., longitudinal), respectively. Moreover, since Maxwell's equations are linear, we can express the field as a superposition of two fields, one which is transverse electric with respect to the n direction, with $E_n = 0$, and one which is transverse magnetic, with $H_n = 0$. It is then readily found that for the transverse electric field, $\mathbf{E}' = \mathbf{E}'_T$,


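$$\mathbf{H}'_T = \frac{1}{i\omega\mu}\,\nabla_n \times \mathbf{E}', \qquad H'_n = \frac{1}{i\omega\mu}\,\mathbf{u}_n \cdot (\nabla \times \mathbf{E}'), \tag{3a}$$

where $\nabla_n$ is the $n$-component of the $\nabla$ operator (for the curl operation) and $\mathbf{E}'$ is given by

$$\nabla^2\mathbf{E}' + \omega^2\mu\epsilon\,\mathbf{E}' + \frac{\nabla\mu}{\mu} \times \nabla \times \mathbf{E}' - \nabla(\nabla \cdot \mathbf{E}') = 0, \tag{4a}$$

with

$$\nabla \cdot (\mu\mathbf{E}') = 0 \quad\text{and}\quad \mathbf{E}' \cdot \left(\frac{\nabla\epsilon}{\epsilon} - \frac{\nabla\mu}{\mu}\right) = 0. \tag{5a}$$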

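Correspondingly, it is found that for the transverse magnetic field, $\mathbf{H}'' = \mathbf{H}''_T$,

$$\mathbf{E}''_T = -\frac{1}{i\omega\epsilon}\,(\nabla_n \times \mathbf{H}''), \qquad E''_n = -\frac{1}{i\omega\epsilon}\,\mathbf{u}_n \cdot (\nabla \times \mathbf{H}''), \tag{3b}$$

$\mathbf{H}''$ being given by

$$\nabla^2\mathbf{H}'' + \omega^2\mu\epsilon\,\mathbf{H}'' + \frac{\nabla\epsilon}{\epsilon} \times \nabla \times \mathbf{H}'' - \nabla(\nabla \cdot \mathbf{H}'') = 0, \tag{4b}$$

with

$$\nabla \cdot (\epsilon\mathbf{H}'') = 0 \quad\text{and}\quad \mathbf{H}'' \cdot \left(\frac{\nabla\epsilon}{\epsilon} - \frac{\nabla\mu}{\mu}\right) = 0. \tag{5b}$$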

The power flow in the direction of propagation is given by Poynting's vector

$$\mathbf{P} = \mathbf{E}_T \times \mathbf{H}_T, \tag{6}$$

with $\mathbf{E}_T = \mathbf{E}'_T + \mathbf{E}''_T$ and $\mathbf{H}_T = \mathbf{H}'_T + \mathbf{H}''_T$. For media for which Poynting's
vector can be expressed as a sum of two terms, one involving only the TE field components
and one involving only the TM components, the TE and TM modes are mutually independent.
For such "separable" distributions we have, with $\mathbf{P} = \mathbf{P}_{TE} + \mathbf{P}_{TM}$,

[Eq. (7) is not legible in the source.]

Clearly then, in separable distributions the TE and TM signals will have the same
intensity distributions and propagate with the same group velocity if, and only if,

[Eq. (8): left-hand side not legible in the source] = constant

or, equivalently, if $\mathbf{H}''/\sqrt{\epsilon}$ and $\mathbf{E}'$ satisfy the same wave equation.

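We limit our considerations here to dielectric media, in which there is no spatial
variation in the magnetic permeability $\mu$. We then find that

$$\nabla \cdot \mathbf{E}' = 0, \qquad \nabla \cdot \mathbf{H}'' = 0, \qquad \mathbf{E}' \cdot \nabla\epsilon = 0, \qquad \mathbf{H}'' \cdot \nabla\epsilon = 0, \tag{9}$$

and the wave equations for $\mathbf{E}'$ and $\mathbf{H}''$ reduce to

$$\nabla^2\mathbf{E}' + \omega^2\mu\epsilon\,\mathbf{E}' = 0, \tag{10a}$$

$$\nabla^2\mathbf{H}'' + \omega^2\mu\epsilon\,\mathbf{H}'' + \frac{\nabla\epsilon}{\epsilon} \times \nabla \times \mathbf{H}'' = 0, \tag{10b}$$

respectively. Polarization independence of the signal will now result when the wave
equation for $\mathbf{H}''/\sqrt{\epsilon}$ is identical to the wave equation for $\mathbf{E}'$. Rewriting Eq. (10b) in
terms of $\mathbf{H}''/\sqrt{\epsilon} = \mathbf{F}$, we obtain

$$\nabla^2\mathbf{F} + \omega^2\mu\epsilon\,\mathbf{F} + \frac{1}{2\epsilon}\left[\nabla\epsilon \times \nabla \times \mathbf{F} - \nabla \times (\nabla\epsilon \times \mathbf{F}) - 1.5\,\frac{(\nabla\epsilon)^2}{\epsilon}\,\mathbf{F}\right] = 0. \tag{11}$$

Obviously, this equation will reduce to the same form as Eq. (10a) and result in
$\mathbf{F} = \mathbf{H}''/\sqrt{\epsilon} = \mathbf{E}'$ if

$$\nabla\epsilon \times \nabla \times \mathbf{F} - \nabla \times (\nabla\epsilon \times \mathbf{F}) - 1.5\,\frac{(\nabla\epsilon)^2}{\epsilon}\,\mathbf{F} = 0. \tag{12}$$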

Inhomogeneous  dielectrics  for  which  Eqs.  (9)  and  (12)  can  be  satisfied  will  give  rise  to 
polarization  independent  propagation  by  non-coupled  TE  and  TM  fields. 
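
The step from Eq. (10b) to Eq. (11) can be sketched as follows. This is a reconstruction under the conditions of Eq. (9), not a transcription of the authors' working; it uses only $\mathbf{H}'' = \sqrt{\epsilon}\,\mathbf{F}$, $\mathbf{F} \cdot \nabla\epsilon = 0$ and $\nabla \cdot \mathbf{F} = 0$ (the latter follows from $\nabla \cdot \mathbf{H}'' = 0$ and $\mathbf{H}'' \cdot \nabla\epsilon = 0$).

```latex
\begin{align*}
\nabla^2(\sqrt{\epsilon}\,\mathbf{F})
  &= \sqrt{\epsilon}\,\nabla^2\mathbf{F}
   + \frac{1}{\sqrt{\epsilon}}(\nabla\epsilon \cdot \nabla)\mathbf{F}
   + \left[\frac{\nabla^2\epsilon}{2\sqrt{\epsilon}}
   - \frac{(\nabla\epsilon)^2}{4\epsilon\sqrt{\epsilon}}\right]\mathbf{F}, \\
\frac{\nabla\epsilon}{\epsilon} \times \nabla \times (\sqrt{\epsilon}\,\mathbf{F})
  &= \frac{1}{\sqrt{\epsilon}}\,\nabla\epsilon \times \nabla \times \mathbf{F}
   - \frac{(\nabla\epsilon)^2}{2\epsilon\sqrt{\epsilon}}\,\mathbf{F}
   && (\mathbf{F} \cdot \nabla\epsilon = 0), \\
-\nabla \times (\nabla\epsilon \times \mathbf{F})
  &= \nabla\epsilon \times \nabla \times \mathbf{F}
   + 2(\nabla\epsilon \cdot \nabla)\mathbf{F}
   + (\nabla^2\epsilon)\,\mathbf{F}
   && (\nabla \cdot \mathbf{F} = 0,\ \mathbf{F} \cdot \nabla\epsilon = 0).
\end{align*}
% Insert the first two lines into Eq. (10b), divide by sqrt(epsilon), and use the
% third line to eliminate (grad eps . grad)F and (laplacian eps)F:
\begin{equation*}
\nabla^2\mathbf{F} + \omega^2\mu\epsilon\,\mathbf{F}
 + \frac{1}{2\epsilon}\left[\nabla\epsilon \times \nabla \times \mathbf{F}
 - \nabla \times (\nabla\epsilon \times \mathbf{F})
 - \frac{3}{2}\,\frac{(\nabla\epsilon)^2}{\epsilon}\,\mathbf{F}\right] = 0 .
\end{equation*}
```

The two $(\nabla\epsilon)^2$ contributions, $-1/4$ and $-1/2$, combine into the factor $-3/4 = -\tfrac{1}{2}(1.5)$ that appears in Eq. (11).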

When  the  spatial  variation  of  the  dielectric  constant  can  be  expressed  in  terms  of 
a one-dimensional  variable,  Eqs.  (9)  and  (12)  can  be  reduced  to  explicit  necessary 
and  sufficient  conditions  for  polarization  independent  propagation  of  non-coupled  TE 
and TM modes.

Consider an orthogonal, curvilinear coordinate system $[\xi_1, \xi_2, \xi_3]$ with scale
factors $[h_1, h_2, h_3]$ given by $h_j^2 = (\partial x/\partial\xi_j)^2 + (\partial y/\partial\xi_j)^2 + (\partial z/\partial\xi_j)^2$,
$[x, y, z]$ being the Cartesian coordinates. Let $\epsilon$ be a function of one coordinate only.
Since the numbering of the coordinate system is arbitrary, we choose this to be the $\xi_1$
coordinate, that is, $\epsilon = \epsilon(\xi_1)$. Moreover, we choose the direction of propagation to
coincide with the direction of $\xi_3$, that is, we set $E'_3 = 0$ and $H''_3 = 0$. It is then found
from Eqs. (9) that for non-coupled TE and TM modes,


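$$\mathbf{E}' = [0, E'_2, 0], \qquad \mathbf{H}'' = [0, H''_2, 0], \tag{13}$$

[Eq. (14) is not legible in the source.]

and Eqs. (11) and (12) reduce to the requirement that $\partial E'_2/\partial\xi_2 = 0$, $\partial H''_2/\partial\xi_2 = 0$, and

[Eq. (15), a set of conditions on the derivatives of the scale factors, is not legible in the source.]

with $\epsilon(\xi_1)$ given by

[Eq. (16), a second-order differential equation in $\xi_1$ for $1/\sqrt{\epsilon}$, is not legible in the source.]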


Equation (15) specifies the coordinate systems and field polarizations for which
polarization independent propagation of non-coupled TE and TM fields is possible; the
dielectric constant profile $\epsilon(\xi_1)$ which can support such fields is given by Equation (16).

For the plane-stratified case we choose a Cartesian coordinate system $(x, y, z)$ and
assume an electromagnetic wave propagating in the $z$ direction. With $[x, y, z] = [\xi_1, \xi_2, \xi_3]$
and $[h_1 = 1, h_2 = 1, h_3 = 1]$, we find from Eqs. (15) and (16) that the medium will support
polarization independent propagation of non-coupled TE and TM modes, with
$\mathbf{E}_{TE} = [0, E_y, 0]$ and $\mathbf{H}_{TM} = [0, H_y, 0]$, if there is no field variation in the $y$ direction and

$$\epsilon(x) = \epsilon(0)\left(\frac{1}{1 + ax}\right)^2, \tag{17}$$

$a$ being an arbitrary constant.
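
As a numerical sanity check of the plane-stratified profile, the sketch below assumes that for $\epsilon = \epsilon(x)$ and a $y$-directed field with no $y$ variation, Eq. (12) reduces to the scalar condition $\epsilon'' - 1.5\,\epsilon'^2/\epsilon = 0$, which is equivalent to $1/\sqrt{\epsilon}$ being linear in $x$. That reduction is our own working, not quoted from the report, and the function names and parameter values are illustrative only.

```python
import math


def eps_profile(x, eps0=2.25, a=0.3):
    """Profile of Eq. (17): eps(x) = eps(0) / (1 + a x)^2 (illustrative values)."""
    return eps0 / (1.0 + a * x) ** 2


def d1(f, x, h=1e-3):
    """Central-difference first derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def d2(f, x, h=1e-3):
    """Central-difference second derivative."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def residual(f, x):
    """(eps'' - 1.5 eps'^2 / eps) normalised by |eps''|; ~0 if the condition holds."""
    e, e1, e2 = f(x), d1(f, x), d2(f, x)
    return (e2 - 1.5 * e1 * e1 / e) / abs(e2)


def inv_sqrt_curvature(f, x):
    """Second derivative of 1/sqrt(eps); ~0 when 1/sqrt(eps) is linear in x."""
    return d2(lambda t: 1.0 / math.sqrt(f(t)), x)


if __name__ == "__main__":
    for x in (0.0, 1.0, 2.0):
        print(x, residual(eps_profile, x), inv_sqrt_curvature(eps_profile, x))
```

For comparison, an exponential profile such as `lambda x: 2.25 * math.exp(-0.3 * x)` gives a normalised residual near -0.5, so it does not satisfy the assumed condition.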

For rotationally symmetric cylindrical media, we consider a waveguide with a
medium having a dielectric constant $\epsilon(r)$ varying in the radial direction of a cylindrical
coordinate system $(r, \phi, z)$. Assuming a wave propagating in the $z$ direction, we have
$[r, \phi, z] = [\xi_1, \xi_2, \xi_3]$ and $[h_1 = 1, h_2 = r, h_3 = 1]$, and Eqs. (15) and (16) show that the
waveguide will support polarization independent non-coupled TE and TM modes, with
$\mathbf{E}_{TE} = [0, E_\phi, 0]$ and $\mathbf{H}_{TM} = [0, H_\phi, 0]$, if the field is rotationally symmetric
(i.e., $\partial/\partial\phi = 0$) and

[Eq. (18) is missing from the source.]

$a$ being an arbitrary constant.

We observe that for practical optical communication fibers, the dielectric constant
profile as given by Eq. (18) closely approximates the parabolic distribution
$\epsilon(r) = \epsilon(0)(1 - a^2 r^2)$. Consequently, also in parabolic fibers, the propagation of rotationally
symmetric TE and TM signals is nearly polarization independent. A comparison of the
properties of the distribution as given by Eq. (18) with the parabolic distribution will be
presented elsewhere.


We next consider a medium with a dielectric constant $\epsilon = \epsilon(\theta)$ varying in the $\theta$
direction of a spherical coordinate system $(r, \theta, \phi)$. For a wave propagating in the
radial direction, we have $[r, \theta, \phi] = [\xi_3, \xi_1, \xi_2]$ and $[h_1 = r, h_2 = r\sin\theta, h_3 = 1]$, and the
medium will have polarization independent propagation of non-coupled TE and TM modes, with
$\mathbf{E}_{TE} = [0, 0, E_\phi]$ and $\mathbf{H}_{TM} = [0, 0, H_\phi]$, if there is no field variation in the $\phi$ direction and

$$\epsilon(\theta) = \frac{\epsilon(0)}{(1 + A\cos\theta)^2}, \tag{19}$$

$A$ being an arbitrary constant.


The electromagnetic field of an infinite-strip laser resonator with parabolic end
reflectors can be conveniently described in an elliptic cylinder coordinate system
$(u, v, z)$, where $1 \le u < \infty$, $-1 \le v \le +1$, $-\infty < z < \infty$; surfaces $u$ = constant are elliptic
cylinders $x^2/u^2 + y^2/(u^2 - 1) = a^2$, $a$ being a scale factor, surfaces $v$ = constant are
hyperbolic cylinders $x^2/v^2 - y^2/(1 - v^2) = a^2$, and surfaces $z$ = constant are planes.
According to Eq. (15), polarization independent propagation of non-coupled TE and TM
modes is possible in such a coordinate system if $\mathbf{E}_{TE} = [0, 0, E_z]$, $\mathbf{H}_{TM} = [0, 0, H_z]$, and
the dielectric constant $\epsilon$ varies with either the $u$ or the $v$ coordinate. Assuming that
$\epsilon = \epsilon(u)$, we have $[u, v, z] = [\xi_1, \xi_3, \xi_2]$ and

$$h_1 = a\sqrt{(u^2 - v^2)/(u^2 - 1)}, \qquad h_3 = a\sqrt{(u^2 - v^2)/(1 - v^2)}, \qquad h_2 = 1,$$

and the polarization independent profile is given by

$$\epsilon(u) = \frac{\epsilon(0)}{(1 + A\cosh^{-1}u)^2} = \frac{\epsilon(0)}{\left[1 + A\ln\left(u + \sqrt{u^2 - 1}\right)\right]^2}, \tag{20}$$

$A$ being an arbitrary constant. For values of $u$ close to unity Eq. (20) reduces to

$$\epsilon(u) = \epsilon(x, y) = \epsilon(0)\,\frac{a^2 - x^2}{\left[\sqrt{a^2 - x^2} + A|y|\right]^2}.$$
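
The near-axis form can be compared with Eq. (20) numerically. The sketch below assumes the parametrisation $x = a\,u\,v$, $y = a\sqrt{u^2 - 1}\sqrt{1 - v^2}$ (upper half-plane), which is one choice consistent with the coordinate surfaces stated above but is not given explicitly in the report; the values of `A`, `a` and `eps0` are illustrative.

```python
import math


def to_cartesian(u, v, a=1.0):
    """Assumed map from elliptic-cylinder (u, v) to Cartesian (x, y), y >= 0."""
    x = a * u * v
    y = a * math.sqrt(u * u - 1.0) * math.sqrt(1.0 - v * v)
    return x, y


def eps_exact(u, A=0.5, eps0=2.25):
    """Eq. (20): eps(u) = eps(0) / (1 + A arccosh(u))^2."""
    return eps0 / (1.0 + A * math.acosh(u)) ** 2


def eps_near_axis(x, y, a=1.0, A=0.5, eps0=2.25):
    """Approximation quoted for u close to unity."""
    s = math.sqrt(a * a - x * x)
    return eps0 * (a * a - x * x) / (s + A * abs(y)) ** 2


if __name__ == "__main__":
    for u in (1.001, 1.01, 1.1):
        x, y = to_cartesian(u, 0.3)
        exact, approx = eps_exact(u), eps_near_axis(x, y)
        print(u, exact, approx, abs(approx - exact) / exact)
```

The relative difference shrinks as `u` approaches 1, as expected for an expansion about the focal line.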
LEAKAGE AND RESONANCE EFFECTS ON STRIP WAVEGUIDES FOR INTEGRATED
OPTICS

S. T. Peng and A. A. Oliner
A. Introduction

The basic strip waveguide for integrated optics which is under consideration is
shown in Figure 1. As seen, this waveguide is not planar, but is a three-dimensional
structure  which  confines  the  wave  horizontally  as  well  as  vertically.  The  strip  produces 
a central  region  with  an  effective  refractive  index  greater  than  that  of  the  surrounding 
regions;  as  a result,  the  surface  wave  that  is  present  on  the  thin  film  is  pulled  in  by 
the  strip  region  and  confined  to  its  neighborhood.  The  guiding  process  is  usually  viewed 
in  terms  of  waves  in  the  strip  region  bouncing  back  and  forth  at  an  angle  between  the 
two  sides  of  the  strip,  undergoing  total  reflection  at  each  bounce. 


Fig.  1.  The  optical  strip  waveguide. 

Theoretical analyses for the propagation characteristics of the strip waveguide
have appeared in the literature,$^{1-4}$ but these analyses assume the presence of only one
mode type, TE or TM, in each of the transverse regions comprising the cross section
of the strip waveguide. When the field behavior at the strip sides is viewed more
carefully, however, it is easy to see that a TE or TM mode incident on a strip side produces
not only a reflected and transmitted wave of its own type, but also excites a reflected
and a transmitted wave of the other type, plus a continuous spectrum (which is purely
reactive here). This coupling at the strip sides between TE and TM modes produces the
interesting propagation effects described here.

To show why such cross-coupling must occur, consider first a TE mode propagating
along an infinitely wide planar structure identical to that in the strip region. If the
wave is propagating in the $z$ direction, it has only the following field components: $H_y$,
$H_z$ and $E_x$. If, however, the wave travels at a small angle with respect to the $z$
direction, additional small components $H_x$ and $E_z$ arise. Similarly, a TM mode traveling at
a small angle in that configuration will possess not only components $E_y$, $E_z$ and $H_x$, but
also small amounts of $E_x$ and $H_z$. When such a TE wave bounces back and forth in the
actual strip region, it hits the strip sides at a small angle with respect to the $z$ direction;
the small $H_x$ and $E_z$ field components then excite a TM mode at the strip sides.
Similarly, the small $E_x$ and $H_z$ components of an incident TM mode will excite a TE mode
there. Thus, TE-TM mode coupling necessarily occurs at these strip sides, and this
small coupling gives rise to interesting consequences under suitable circumstances.


B. Qualitative Explanation of Leakage Effects

In order to obtain a simple qualitative understanding of the effects of TE-TM mode
coupling on the guidance behavior, let us consider a special case of the strip waveguide,
for which the strip material is the same as that of the thin film. Such a structure,
referred to as a "rib waveguide" by Bell Laboratories, is shown in the inset in Figure
2. The dispersion curves presented in Fig. 2 apply to the constituent regions comprising
the waveguide cross section, as if each were infinitely wide. The TE mode is seen
to be the dominant mode, with the curve for the other polarization (TM) located nearby.
The vertical line $t_1$ corresponds to the (thicker) strip region, and the $t_2$ line to the film
region.


Fig. 2. Dispersion curves for the constituent waves
in the strip and film regions.

For complete guiding, the wave or waves inside the strip region must be above
cutoff (propagating) and the wave or waves outside must be below cutoff (evanescent).
Suppose that the basic wave is TE, shown as a circle in Fig. 2 at thickness $t_1$, in the
strip region. That point corresponds to the highest value of $n_{\mathrm{eff}}$ and is therefore the
slowest constituent wave. For guidance, the TE mode at thickness $t_2$, outside, must
be evanescent. Thus, the TM modes which are excited in both the $t_1$ and $t_2$ regions are
seen to be waves that are faster than the TE mode outside, and they are therefore
certainly evanescent. Hence, for the lowest TE mode nothing interesting happens; the
wave is still purely bound, although the numerical value of the propagation constant
changes a bit due to the coupling.

When the basic mode is TM, however, we consider the square in Fig. 2 at
thickness $t_1$. For guiding, that constituent wave must be propagating and the TM wave outside
(at $t_2$) must be evanescent. Thus, if the TM-TE mode coupling were neglected, the
wave would be completely bound. The TE modes which are excited at the strip sides
are, however, seen from Fig. 2 to be slower than the TM wave inside the strip, and
thus to be above cutoff both inside the strip ($t_1$) and outside in the film region ($t_2$).

This situation is depicted in Figure 3. A TM mode incident from the inside excites
a reflected propagating TM mode inside, an evanescent TM mode outside, and propagating
TE modes both outside and inside. The propagating TE mode outside carries away
a small amount of energy at each reflection, resulting in a leaky wave rather than a
purely bound wave.


Fig. 3. TE-TM coupling in the "TM"
case, showing leakage.

It is shown later that the propagating TE mode inside the strip region, since it is
slower than the TM mode there, produces resonance effects.

Depending on how close the resonance curves are to each other, the TE mode outside
may or may not be above cutoff. Thus the TM mode may or may not be leaky,
depending on the properties of the waveguide, but leakage will occur for many of the
strip waveguides being considered today. Similar considerations apply to all higher
modes, both TE and TM; leakage will occur under appropriate conditions.
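
The cutoff argument of this section can be sketched as a small classifier. It assumes the usual effective-index picture, in which a constituent wave with effective index `n_c` propagates transversely in a region when `n_c` exceeds the effective index `n_guided` of the guided mode (so that $k_x^2 = k^2(n_c^2 - n_{guided}^2) > 0$). The function names and all numerical indices below are hypothetical and are not taken from Fig. 2.

```python
def transverse_state(n_constituent, n_guided):
    """'propagating' (above cutoff) if the constituent index exceeds the guided index."""
    return "propagating" if n_constituent > n_guided else "evanescent"


def classify_guided_mode(n_guided, own_strip, own_film, cross_strip, cross_film):
    """Classify a strip-guided mode as 'bound', 'leaky' or 'not guided'.

    own_*   : constituent indices of the basic polarization (strip, film)
    cross_* : constituent indices of the other polarization (strip, film)
    """
    inside = transverse_state(own_strip, n_guided)
    outside = transverse_state(own_film, n_guided)
    if inside != "propagating" or outside != "evanescent":
        return "not guided"
    # Leakage: the cross-polarized wave excited at the strip side is above
    # cutoff in the film region and carries energy away.
    if transverse_state(cross_film, n_guided) == "propagating":
        return "leaky"
    return "bound"


# Hypothetical constituent indices: TE slower than TM, strip slower than film.
N_TE_STRIP, N_TM_STRIP = 1.600, 1.580
N_TE_FILM, N_TM_FILM = 1.570, 1.550

te_case = classify_guided_mode(1.590, N_TE_STRIP, N_TE_FILM, N_TM_STRIP, N_TM_FILM)
tm_case = classify_guided_mode(1.565, N_TM_STRIP, N_TM_FILM, N_TE_STRIP, N_TE_FILM)
# Same TM mode, but with a hypothetical film whose TE index is lower.
tm_low_te_film = classify_guided_mode(1.565, N_TM_STRIP, N_TM_FILM, N_TE_STRIP, 1.560)

print(te_case, tm_case, tm_low_te_film)
```

With these illustrative numbers the basic TE mode is bound, the basic TM mode leaks because the TE wave in the film lies above the guided index, and lowering the TE film index below the guided index removes the leakage.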

C.  Numerical  Examples 

Numerical examples are important in order to determine whether the leakage
described above is negligible or significant. In the examples discussed below it is
significant.

Let us consider the structure shown in the inset in Fig. 4, where the dimensions
and refractive indices are indicated. Calculations were performed taking the TE-TM
mode coupling into account correctly, assuming that only the dominant modes of each
type can propagate in the constituent regions (valid for the film thicknesses chosen), and
neglecting the continuous spectrum. It is found that the dispersion curves ($n_{\mathrm{eff}}$ vs. $W/\lambda$)
are affected very little by the coupling. No curves are shown here, therefore.

The curve of attenuation vs. $W/\lambda$, on the other hand, is very interesting, and is
shown in Figure 4. It should first be recognized that if the TE-TM coupling did not
occur there would be no attenuation, since we are neglecting material losses.
Therefore, all the attenuation shown is due to leakage.

Fig. 4. Attenuation, due to TE-TM coupling, as a function
of strip width for the strip waveguide.

The numerical values for $\alpha$ are of the order of the measured results (including
material losses), so that the leakage effect is significant. The leakage is seen to be
greater for greater rib height, as expected since then a larger step discontinuity occurs
at the strip side. The leakage also decreases as the strip width $W$ increases. This
occurs because, as $W$ increases, the constituent waves inside the strip region approach
closer to glancing incidence, and the coupled components decrease in amplitude.

In addition, it is evident from Fig. 4 that a resonance (or a cancellation) effect is
occurring. It is due to the TE mode inside which is bouncing back and forth above
cutoff. From a simple calculation, we find that, to within 1%,


W = 2π for the first resonance


W = 4π for the second resonance

We have also derived an equivalent network for the step junction discontinuity
corresponding to the strip side, and we can readily explain the resonance behavior using this
network.

Similar considerations apply to other waveguiding structures, including the slot
waveguide,$^{5-9}$ consisting of an uncoated strip region between metal-coated regions. We
have performed a similar analysis for a typical slot waveguide, and the numerical
results show that the resonance effect dramatically reduces not only any leakage produced,
but also the attenuation due to the presence of the metal.

The leakage and resonance properties have evident implications for device
performance when TM polarization is used. For example, the properties of directional
couplers,  which  consist  of  two  waveguides  placed  parallel  to  each  other  over  a specified 
length,  could  be  influenced  in  important  ways.  Because  of  the  leakage,  the  coupled 
waveguides  need  not  be  as  near  to  each  other  for  the  same  coupling  length.  Additional 
flexibility  is  available  since  the  leakage  can  be  avoided  altogether  by  making  the  strip 
side  sufficiently  high  (see  Figure  2).  Then,  the  film  thickness  outside  of  the  coupled 
region  can  be  made  small  so  as  to  avoid  leakage  outward,  but  the  film  thickness  in  the 
region  between  the  two  strips  would  be  made  greater  to  permit  leakage  to  occur  and  thus 
to  enhance  the  coupling  between  the  strips.  We  are  in  the  process  of  examining  the 
effects  of  leakage  on  such  directional  coupler  performance. 
A NOVEL LEAKY-WAVE STRIP WAVEGUIDE DIRECTIONAL COUPLER FOR
INTEGRATED OPTICS

E-W. Hu, S. T. Peng and A. A. Oliner
A. Introduction

A new  type  of  directional  coupler  for  integrated  optics  is  presented  here  which 
looks  like  the  ordinary  strip  waveguide  type  of  directional  coupler  but  which  operates 
on  a different  coupling  mechanism.  As  shown  in  Fig.  1,  the  coupler  consists  of  two 
strip  waveguides  placed  on  a thin  film  on  a substrate,  where  the  two  strips  couple  to 
each  other  over  a given  length.  However,  its  operation  allows  a greater  separation 
between  the  two  strips  than  is  present  in  the  customary  coupler. 


Fig.  1.  Cross  section  of  the  directional  coupler 
consisting  of  two  strips,  spaced  s apart, 
on  a thin  film  on  a substrate. 

In the ordinary type of strip waveguide coupler, each strip lies in the
exponentially-decaying transverse field of the other strip, and the coupling which occurs is due to
these fields. The coupling is a very sensitive function of the separation ($s$ in Fig. 1)
between the strips, and the coupling length, say the distance along the strips for
complete transfer of power, is therefore sensitively dependent on this separation. If the
separation is too great, of course, the coupling becomes negligible and the coupling
length becomes excessive.

The  new  type  of  directional  coupler  proposed  here  depends  on  a new  phenomenon, 
leakage  between  the  strips  which  arises  under  appropriate  design  conditions.  Because 
of  this  leakage,  the  coupling  is  no  longer  as  sensitive  to  the  separation  s,  and  the 
coupling  length  can  be  adjusted  to  the  desired  length,  within  limits,  by  controlling  the 
amount  of  leakage.  Since  the  separation  s need  no  longer  be  very  small,  and  since  the 
coupling  length  is  less  sensitively  dependent  on  it,  this  new  type  of  coupler  relaxes  the 
fabrication  tolerances  in  certain  ways. 

As will be shown below, leakage occurs only for the mode for which the electric
field is vertical. That mode corresponds to basic TM excitation (all the modes are
actually hybrid, of course). The basic TE mode does not leak. The directional coupler
is therefore restricted in its operation to the TM-type mode. On the other hand, this
restriction can be a virtue since it permits coupling for the TM-type mode but not for
the TE-type mode if the separation $s$ is sufficiently large (a half-dozen wavelengths or
so, depending on the parameters).

In the discussions which follow, the origin of the leakage is explained, the
operation of the coupler is described, and numerical results obtained from a theoretical
analysis are presented for a specific example.

B.  Explanation  of  the  Leakage  Effect 

Let us consider first the propagation behavior of a single isolated strip on the
film and substrate. The guiding process is usually viewed in terms of waves in the
strip region bouncing back and forth at an angle between the two sides of the strip,
undergoing total reflection at each bounce. The approximate theoretical analyses which
have appeared in the literature$^{1-4}$ assume the presence of only one mode type, TE or
TM, in each of the transverse regions comprising the cross section of the strip
waveguide. When the field behavior at the strip sides is viewed more carefully, however,
it is easy to see that a TE or TM mode incident on a strip side produces not only a
reflected and transmitted wave of its own type, but also excites a reflected and
transmitted wave of the other type, plus a continuous spectrum (which is purely reactive
here). It is this small coupling at the strip sides between TE and TM waves that
produces the leakage which forms the basis for the novel directional coupler.

This  leakage  effect  was  recognized  and  described  recently  by  two  of  the  present 
authors.^  All  of  the  previously-published  analyses  miss  this  effect  entirely. 

A discussion  of  the  leakage  effect,  and  a qualitative  explanation  regarding  when 
it  occurs  and  how  it  manifests  itself,  are  included  in  the  report  immediately  preceding 
this  one.  That  explanation  made  use  of  a special  case  of  the  strip  waveguide,  for  which 
the  strip  material  is  the  same  as  that  of  the  thin  film.  Such  a structure  has  been  called 
the  "rib  waveguide"  by  the  Bell  Laboratories  in  their  studies. 

It  is  shown  in  the  preceding  report^  that  a TM  mode  incident  on  a strip  side  from 
the  inside  of  the  strip  excites  a reflected  propagating  TM  mode  inside,  an  evanescent 
TM  mode  outside,  and  propagating  TE  modes  both  outside  and  inside.  The  propagating 
TE  mode  outside  carries  away  a small  amount  of  energy  at  each  reflection,  resulting 
in  a leaky  wave  rather  than  a purely  bound  wave.  Depending  on  how  close  the  resonance 
curves  of  the  constituent  regions  are  to  each  other,  the  TE  mode  outside  may  or  may 
not  be  above  cutoff.  Thus,  the  TM  mode  may  or  may  not  be  leaky,  depending  on  the 
properties  of  the  waveguide. 
C. The Operation of the Leaky-Wave Coupler

We  may  now  return  to  the  directional  coupler  configuration  of  Figure  1.  When 
one  of  the  two  strip  waveguides  is  excited  in  the  basic  TE  mode  (vertical  magnetic  field), 
that  waveguide  does  not  leak,  and  the  other  strip  waveguide  senses  only  its  transversely 
evanescent  field.  The  directional  coupler  configuration  then  operates  in  standard  fashion, 
so  that  negligible  coupling  between  the  strips  would  occur  if  the  strip  separation  s were 
made  sufficiently  large. 

On the other hand, when the strip is excited in the basic TM mode (vertical
electric field), the wave may leak if the geometrical parameters and refractive indices are
selected appropriately. The field components present in the film region under these
conditions of leakage consist of the transversely evanescent TM wave and the
propagating TE wave, the latter possessing small amplitude. Actually, the propagating TE
wave also slowly increases exponentially in the transverse ($x$) direction, consistent
with the small exponential decay in the longitudinal ($z$) direction of the leaky guided
wave, but this property affects the performance negligibly. When the strip separation
$s$ is large, the other strip no longer senses the evanescent TM field but it couples to
the transversely propagating TE wave component.

The performance of the coupler was analyzed (taking both the TE and TM
components into account) by determining the propagation characteristics of the symmetric and
the antisymmetric modes of the overall structure comprised of both strips. The
symmetric and antisymmetric modes behave slightly differently even when the separation $s$
is large because the leaky TE wave component reacts differently to the electric or
magnetic wall nature of the midplane at $x = 0$.

Numerical values for a typical special case are shown in Fig. 2 for both basic TE
and basic TM excitation. Plotted are $n_{\mathrm{eff}}$ ($= \beta/k = \lambda/\lambda_g$) as a function of $s/\lambda$ for both the
symmetric and antisymmetric modes. It is seen that for the TE basic wave, which
does not leak, the behavior is the customary one. For the leaky TM basic wave, one
notes that the coupling continues even as the separation $s$ becomes large. In fact, the
symmetric and antisymmetric mode contributions interlace with each other, and the
regions of maximum spacing between them (which occur periodically) are relatively
flat.

Fig. 2. Coupling properties of the leaky-wave directional coupler.
Shown are $n_{\mathrm{eff}}$ ($= \beta/k = \lambda/\lambda_g$) as a function of $s/\lambda$, where $s$
is the strip separation, for both symmetric and antisymmetric
modes on the coupler structure, for both basic TE
and TM excitations.

The coupling length $L$, which can be defined as the length corresponding to
complete power transfer, is given by $\pi/|\beta_s - \beta_a|$, where $\beta_s$ and $\beta_a$ are the propagation
wavenumbers of the symmetric and antisymmetric modes, respectively. For a strip
separation $s = 12\lambda$, one finds $L \approx 2500\lambda$, which is a desirable length. Furthermore, one
sees from Fig. 2 that a small fabrication error in the value of $s$ will negligibly affect
the value of $L$, so that this type of coupler should permit looser tolerances in some
parameters.
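
The coupling-length relation is easy to evaluate directly. The sketch below uses $L = \pi/|\beta_s - \beta_a|$ from the text together with $\beta = 2\pi n_{\mathrm{eff}}/\lambda$, which gives $L/\lambda = 1/(2|\Delta n_{\mathrm{eff}}|)$; the function names and the sample indices are illustrative and are not read off Fig. 2.

```python
import math


def coupling_length(beta_s, beta_a):
    """L = pi / |beta_s - beta_a| (same length units as 1/beta)."""
    return math.pi / abs(beta_s - beta_a)


def coupling_length_in_wavelengths(n_eff_s, n_eff_a):
    """L / lambda = 1 / (2 |n_eff_s - n_eff_a|), using beta = 2 pi n_eff / lambda."""
    return 1.0 / (2.0 * abs(n_eff_s - n_eff_a))


def index_split_for_length(length_in_wavelengths):
    """Effective-index difference needed for a given L / lambda."""
    return 1.0 / (2.0 * length_in_wavelengths)


if __name__ == "__main__":
    # A coupling length of 2500 wavelengths corresponds to a mode splitting of 2e-4.
    print(index_split_for_length(2500.0))
    # Hypothetical symmetric/antisymmetric indices with that splitting.
    print(coupling_length_in_wavelengths(1.5002, 1.5000))
```

An index splitting of only about $2 \times 10^{-4}$ between the symmetric and antisymmetric modes is therefore enough to give the quoted $L \approx 2500\lambda$.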

In  the  structure  of  Fig.  1,  leakage  also  occurs  in  the  film  regions  outside  the 
strips.  If  such  leakage  is  undesirable,  it  can  be  eliminated  by  making  the  film  thicker 
in  the  region  between  the  strips  than  in  the  outside  regions.  Then  one  can  achieve  the 
desired  leakage  between  the  strips  and  yet  avoid  leakage  outward.  The  basis  for  this 
statement  follows  from  the  discussion  in  the  preceding  report. 

PHOTOLITHOGRAPHIC LIFTOFF OF RF-SPUTTERED DIELECTRIC FILMS, WITH
APPLICATION TO INTEGRATED OPTICS

E-W. Hu
A. Introduction

The photolithographic lift-off technique was reported^ to generate one micrometer
line-width metallic interdigital electrode patterns with high yield for surface acoustic
wave devices. For a successful lift-off, it is essential to produce a sharp vertical
profile in the photoresist. In addition, it was also reported^ that the film should
preferably be prepared by electron beam evaporation in which the evaporated atoms to be
deposited on the substrate arrive at the substrate at near-normal incidence such that
the film on the sides of the photoresist relief patterns is considerably thinner than the
film on the substrate and on the top of the photoresist. In this way, the organic solvent
can have access to the photoresist after film deposition to allow the photoresist, along
with the films on top of it, to be lifted off or removed.

Besides the application of the lift-off technique to metallic films, Wolf$^2$
demonstrated its applicability to ion beam sputter-deposited glass films on patterns generated
by a scanning electron microscope for the construction of optical waveguides. However,
for some unknown reasons,$^2$ the edges of the waveguides thus fabricated were not smooth
and sharp. As to rf-sputtered films, it is often believed that the sputtered atoms or
ions possess poor directionality and hence make the lift-off difficult. It was briefly
reported that, even in the case of very thin metal films (300-500 Å of chromium), the
unwanted metal films could not be thoroughly removed.^ However, it seems that no
further detailed work along this line has been reported in the literature.

Rf-sputtering has been extensively used for making thin films for various
applications. It seems particularly attractive for the preparation of low-loss films for
integrated optics applications. Despite the complex plasma processes associated with
sputtering, the important sputtering parameters can be accurately controlled to yield
reproducible results. Furthermore, deposition parameters such as gas content and
sputtering powers can be readily adjusted over a wide range. This, combined with a
relatively slow deposition rate, offers the possibility of depositing films with accurate
control of thickness and of variable refractive index.

In this work, we report our successful application of the photolithographic lift-off
technique to rf-sputtered thin dielectric films and to the fabrication of optical strip-line
waveguiding structures.^

The optical strip-line waveguide is one of the most important components in integrated
optics. It consists, as shown in Fig. 1, of a basic thin film waveguide loaded by a
dielectric strip of finite width W. In the strip region, the guided surface wave is
slower than that in the other regions. As a result, the surface wave energy becomes
trapped in the vicinity of the strip and can therefore be guided one-dimensionally.
This may provide a method for direct coupling to an optical fiber because the wave is
confined transversely in both directions and offers other advantages over the
two-dimensional (planar) simple waveguide. Also, by setting n1 > n2 (Fig. 1), most of
the energy carried by the surface wave is confined in the thin film beneath the strip;
therefore, the requirement for very smooth edges for low-loss waveguiding is less
stringent than that for other types of linear optical waveguide. Moreover, the optical
strip line is the building block for more sophisticated integrated optical devices,
such as modulators, directional couplers and switches.^4,5 Currently, there are several
techniques for making optical strip line and similar structures.^ However, except for
the one that was mentioned before,^2 most of these techniques make use of either ion
beam or rf etching; the procedures involved were often tedious and complex. As will be
discussed later, the significance of our current approach is that it readily provides
a simple alternative way to fabricate optical strip-line structures. The experimental
procedures are discussed in detail in the following section.

Fig. 1. Optical strip-line waveguiding structure:
n1 > n2 > n3 > n4 or n2 ≥ n1 > n3 > n4, where
n1, n2, n3 and n4 are the refractive indices
of film, strip-line, substrate and air,
respectively.

B.  Experiments 

The main steps of the fabrication process are illustrated in Fig. 2 and are discussed
below. As shown in Fig. 2(a), the basic surface waveguide is first made by rf-sputtering
7059 glass or an Al2O3 film on a pre-cleaned 1" x 3" microscope slide. The thickness
and refractive index of the film were measured by the prism-coupler method. The
facilities and procedures for film preparation and characterization were discussed in
our previous work. Both the refractive index and the thickness of the film were
accurately controlled such that the film guided only the dominant TE and TM modes.
Following the film deposition, the sample was immediately transferred to the laminar
hood for photolithographic processing. Care was exercised to avoid airborne dust, which
would later prevent intimate contact between the photomask and the photoresist. This is
especially critical for patterns with micrometer or submicrometer dimensions. Shipley
AZ1350B photoresist was chosen in this work because most of its processing information
was available through previous work done here.^ Standard photolithographic procedures
including spin-coating, baking, exposing and subsequent developing of the photoresist
were performed (Figs. 2(b) and 2(c)). A collimated ultraviolet exposure beam of uniform
intensity was derived from a 200 W high pressure Hg lamp. Intimate contact between the
photomask and the sample was checked by observing interference fringes.^ With intimate
contact between mask and sample, the exposure and development times were found not to
be critical.^ Therefore, the sample was always over-exposed and over-developed to
ensure the complete removal of the exposed photoresist.

Fig. 2. Main steps of photolithographic lift-off technique
for the fabrication of optical strip-line waveguide

After development of the photoresist, the sample was baked at 50°C for 20 minutes in a
vented oven. The sample was then transferred to the vacuum chamber for strip-line film
deposition. Since the softening point of the photoresist is rather low (below 150°C),
no pre-etch of the substrate was done. The sputter-deposition of the strip (Fig. 2(d))
was performed with rf-sputtering powers below 100 W for most cases. Although the
base-plate of the rf system employed in this work was water cooled, it was found that
for sputtering powers higher than 100 W, deformation and even blistering of the
photoresist would occur.

Following the strip deposition, the unwanted photoresist and film were lifted off by
dipping the sample in acetone for about 20 minutes (Fig. 2(e)). Gentle swabbing of the
sample with an acetone-saturated cotton Q-tip was occasionally needed to remove the
residual photoresist, depending on the resist-to-film thickness ratio. For a 4:1 ratio,
no difficulty was encountered in the lift-off procedure. Figure 3(a) shows the
photomicrograph of the photoresist relief pattern of an array of optical strip lines
with resist thickness of about 10,000 Å. Figure 3(b) shows the actual glass strip-line
structure after lift-off. The strip height in Fig. 3(b) is 3,000 Å and the strip width
7.5 μm. The separation distance between strips is 5 μm. Comparison between the
photoresist relief pattern and the actual waveguide structure (Figs. 3(a) and 3(b))
shows that they are almost identical, implying the accurate mapping of the photomask
patterns onto the resist and the subsequent strip-line structure.

However, the actual smoothness of the sides of the strips must be checked with a
scanning electron microscope. Such pictures have been gratifying in that they show very
sharp and steep strip sides. Some inadequacies still appear, however, and they will
require further investigation. In some places, ripples appear on the uncoated glass
substrate, indicating that the substrate should be optically polished before use.
Sometimes, the top edge of the strip side is rounded somewhat; it is believed that this
is due to the swabbing by the Q-tip, but this point will be checked further. Finally,
although the strip sides are satisfactorily vertical, they ripple a bit along their
lengths. It is known that this ripple is due to the mask used,* which was kindly
furnished to us by the MIT Lincoln Laboratory. Although certain improvements are called
for, therefore, the results so far are very encouraging, and they do indicate the
success of the method in producing the required verticality in the strip sides.

Fig. 3. Photomicrographs of

(a) Photoresist relief pattern of an array of strip-lines,
where the resist thickness is 10,000 Å.

(b) Glass optical strip-line waveguide obtained from the
above photoresist relief pattern by the lift-off technique,
where the strip width = 7.5 μm, the separation distance
between strips = 5 μm and the strip height = 3,000 Å.

*A chromium-coated thin glass (Corning 0211) conformable photomask with interdigital
transducer patterns originally designed for surface acoustic wave device applications.

C.  Conclusions 

We have found that the lift-off technique can be used successfully with rf-sputtered
dielectric films. Our method therefore provides a convenient way to fabricate optical
strip lines and similar structures for integrated optics applications. Even though we
used a photomask pattern with a line width of a few micrometers, the results imply that
the dimensions could be reduced to a micrometer or less. This is because the resolution
of the line width is limited by the photolithographic technique rather than by the
lift-off technique. As has been reported in a review article by Smith,^ the conformable
photomasking technique could yield well-defined lines as narrow as a few thousand
angstroms in photoresists with near vertical profile and very little undercut.

As discussed in the previous section, the ratio of the thickness of the photoresist to
that of the film should be approximately 4:1 for easy lift-off. For the Shipley AZ1350B
resist used in this work, it is difficult to obtain uniform resist layers thicker than
about 1.6 μm. This limits the film to be lifted off to about 4,000 Å. For thicker films
or thicker strips, other types of photoresist must be used. Fortunately, there are a
number of commercially-available high-resolution photoresists which could be deposited
as thick as a few micrometers.
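
The thickness limit quoted above follows directly from the 4:1 rule. A minimal sketch
of that arithmetic is given below; the function name and the treatment of 4:1 as an
exact design rule are assumptions made for illustration only, since the text gives the
ratio as approximate.

```python
ANGSTROM_PER_MICROMETER = 10_000


def max_liftoff_film_thickness_angstrom(resist_thickness_um, resist_to_film_ratio=4.0):
    """Thickest film (in angstroms) that a resist layer supports for easy lift-off.

    Assumes the approximate 4:1 resist-to-film thickness ratio stated in the
    text can be applied as a simple design rule (an illustrative assumption).
    """
    resist_thickness_angstrom = resist_thickness_um * ANGSTROM_PER_MICROMETER
    return resist_thickness_angstrom / resist_to_film_ratio


if __name__ == "__main__":
    # 1.6 micrometers is the practical AZ1350B resist limit quoted in the text.
    limit = max_liftoff_film_thickness_angstrom(1.6)
    print(f"Maximum film thickness: about {limit:.0f} angstroms")
```

A resist capable of layers a few micrometers thick would raise the limit in proportion
under the same rule.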

Besides  the  optical  strip- line  structure  discussed  in  this  work,  the  fabrication 
procedures  should  also  be  valuable  in  making  optical  grating  structures.  So  far,  it 
seems  that  most  of  the  gratings  used  as  input/output  beam  couplers  for  integrated  optics 
applications  were  made  by  recording  the  laser-beam  interference  patterns  in  photo- 
resists  which  are  quite  lossy  at  optical  frequencies.  Because  of  the  lossy  nature  of  the 
photoresists,  the  difficulty  in  precisely  controlling  the  grating  profile,  and  other 
problems  associated  with  holographical  grating  generation,  it  seems  that  there  is  still 
a poor  correlation  between  theory  and  actual  device  performance  of  the  grating  couplers. 
Since  almost  all  the  important  fabrication  parameters  of  our  current  approach  could 
be  accurately  controlled,  and  the  periodic  grating  structure  could  be  made  of  low- loss 
dielectric  film,  our  experimental  procedures  may  provide  a simple  means  for  a 
systematic  experimental  study  of  the  grating  couplers. 

Currently, work is underway using this approach to construct a specially-designed
directional coupler to verify a newly-discovered leakage phenomenon associated with
the basic strip-line waveguide,^ and to determine whether or not this leaky-wave
directional coupler is indeed much less sensitive to dimensional tolerances.^
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SUPERRADIANT  OUTPUT  FROM  A DISCHARGE- HEATED  COPPER  VAPOR  LASER 
W.  T.  Walter 

The copper vapor laser is one of the few high-power sources available in the blue-green
spectral region, near the maximum of the water transmission band. Operation of the
copper vapor laser differs from most other lasers in that a high temperature region is
required to generate sufficient vapor density. Although the high temperature presents
some difficult materials problems, it also offers the important advantage of allowing
radiation to be used to stabilize the operating temperature. Radiation can remove the
limitations imposed by thermal conduction and eliminate the external water-cooling
systems required by present high-power systems. Demonstration of a practical
discharge-heated system is therefore important for the further development of this
class of efficient, pulsed gas-discharge lasers.

The first copper vapor lasers^ utilized furnaces to produce operating temperatures of
1400-1600°C. At Polytechnic, 100-kW peak-power, 2-mJ pulses have been demonstrated
with a 0.7% energy-conversion efficiency. More than 10 W of average-power laser output
has been generated using an externally-heated laser tube which required 2700 W of
heater power.^ Although the operating temperature can be reduced by using a compound
source of copper, such as a copper halide,^6-9 the system efficiency is reduced because
electrical discharge energy must be provided to dissociate several molecules for each
copper atom excited through the laser cycle. In addition, the operating regimes of
copper halide systems are very narrow, particularly in terms of tube temperature or
density of the copper halide vapor. Our approach has been to develop an optimized
discharge-heated atomic-copper system by increasing the current density, discharge
impedance and pulse repetition frequency and controlling the thermal conductivity so
that the energy dissipated in the discharge will provide the optimum tube temperature.
Recycling electrodes which utilize capillary action to confine the metal vapor are also
being examined. Under these conditions the metal vapor laser is self-heated. No
external furnace is required.

Previously a discharge-heated copper vapor laser^ had been demonstrated at Polytechnic.
It operated only for short periods of time, however, with an average power output of
only 0.1 W. An improved discharge-heated copper vapor laser has now been constructed
which can be operated for many hours. Tungsten mesh electrodes were spaced 55 cm apart
inside a 12 mm i.d. boron nitride 3-section tube containing 15 gms of copper shot. The
initial measurements reveal significant differences from the laser output of an
externally-heated copper vapor laser. The operating mode of the discharge-heated device
appears to be much closer to that of a superradiant, superfluorescent or amplified
spontaneous emission generator. The presence or absence of mirrors forming an optical
resonator has a much smaller effect on the output, and the average output power is much
lower than that from an externally-heated copper vapor laser. A 3-nsec pulse is present
in the output of the discharge-heated laser without any mirrors, as indicated in the
upper trace of Figure 1. The addition of mirrors does not change the size of this pulse
but appears to lengthen it by the addition of several subsequent pulses of similar
amplitude. The middle and bottom traces of Fig. 1 show the effect of adding first the
far mirror and then both mirrors. Thus addition of mirrors changes the output power by
a relatively small factor (2 to 4). The average output power has been less than ~0.5 W
thus far. The externally-heated devices, on the other hand, have very small
superradiant outputs. When mirrors are added the output power increases by several
factors of ten, and average power outputs as high as 11 W have been obtained.^


Fig. 1. Output at 5105 Å of discharge-heated copper vapor
laser with 10 torr argon operating at 5680 pps.
Detection system consisting of TRC 105 B vacuum
photodiode detector and Tektronix 519 oscilloscope
has 1 GHz bandwidth and risetime < 0.5 nsec.
Horizontal scale is 10 nsec/large division.

a) top trace - no optical resonator (some or all
of the second peak appears to be produced by
diffuse reflection back into the copper gain tube).

b) middle trace - Al plane mirror installed at
far end.

c) bottom trace - optical resonator completed with
20% partially-reflecting plane output mirror.
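
The figures quoted in this section can be related by simple pulse arithmetic. The
sketch below is only an illustration: the function names are hypothetical, and treating
each pulse as rectangular (energy = peak power x duration) is an assumption, not a
statement about the measured pulse shapes.

```python
def energy_per_pulse_joule(average_power_w, pulses_per_second):
    """Mean energy carried by each pulse of a repetitively pulsed laser."""
    return average_power_w / pulses_per_second


def rectangular_pulse_duration_s(pulse_energy_j, peak_power_w):
    """Pulse duration implied by energy and peak power for a rectangular pulse.

    The rectangular shape is an illustrative assumption.
    """
    return pulse_energy_j / peak_power_w


if __name__ == "__main__":
    # Discharge-heated device: average power below ~0.5 W at 5680 pps.
    upper_bound_j = energy_per_pulse_joule(0.5, 5680)
    print(f"Discharge-heated: below about {upper_bound_j * 1e6:.0f} microjoules per pulse")

    # Furnace-heated device: 2-mJ pulses at 100-kW peak power.
    duration_s = rectangular_pulse_duration_s(2e-3, 100e3)
    print(f"Furnace-heated: about {duration_s * 1e9:.0f} nsec equivalent pulse duration")
```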


Since the discharge-heated configuration has a higher total system efficiency and is a
more practical device form, an examination is being undertaken of the differences in
operation and output. The aim is to obtain an understanding of the various processes
involved as well as to find operating regimes which do not suffer from the deficiencies
revealed in these initial measurements. To compare the externally-heated and
discharge-heated lasers, we must improve our temperature measuring procedures. A
two-color, ratio-reading optical pyrometer will be examined for this purpose. The
hottest region of the externally-heated laser is the o.d. of the tube on which the
heater is wound, while the hottest region of the discharge-heated laser is the center
of the discharge tube near the cathode. In addition, in the discharge-heated device the
composition and pressure of additive gases are more critical in keeping the impedance
of the discharge substantially greater than the impedance of the thyratron switch.

Finally the question of superradiance or cooperative spontaneous emission behavior is
also being examined. Many high-power lasers also possess high gain (αL > 1) and
therefore may be subject to limitations imposed by superradiant, superfluorescent or
cooperative spontaneous emission effects. In these processes atoms are coupled together
by their common radiation field and so radiate cooperatively. The intensity emitted in
the phased direction by N electric dipole moments is therefore proportional to N²
instead of N. An output intensity ∝ N² requires that αL > 1, where α is the intensity
gain coefficient. According to Bonifacio and Lugiato,^10 however, superfluorescence
also requires that the length of the active medium must be longer than a threshold
length α⁻¹ characteristic of stimulated processes and shorter than a cooperation length
characteristic of the exchange of energy between the field and the excited atoms. This
latter requirement is disputed by MacGillivray and Feld.^11 Experiments by Gibbs^12 in
cesium vapor and experiments by others in HF, CH3F and Na and Tl vapors have not
resolved this question. Pulses generated by the copper vapor laser without mirrors are
very similar in character and shape to those generated in cesium and other systems
prepared to study cooperative emission effects. Cooperative emission behavior can be
examined using the copper vapor laser, which has the twin advantages of being able to
generate larger inversion densities (up to 10¹⁴ cm⁻³) and having its output in the
visible spectral region where detection systems are more sensitive and have wider
bandwidths. A wider dynamic range is therefore available to examine these effects and
determine their limitations on high-power efficient lasers.
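
The N² versus N scaling mentioned above can be illustrated numerically by adding unit
phasors. This is a toy sketch only: it treats each dipole as a unit-amplitude emitter
with a fixed phase, which is an assumption for illustration and not a model of the
copper vapor discharge; all names are hypothetical.

```python
import cmath
import math
import random


def emitted_intensity(phases):
    """Intensity (arbitrary units) of unit-amplitude emitters with given phases."""
    amplitude = sum(cmath.exp(1j * phase) for phase in phases)
    return abs(amplitude) ** 2


def coherent_intensity(n_dipoles):
    """All dipoles radiate in phase: the amplitudes add, so intensity is N squared."""
    return emitted_intensity([0.0] * n_dipoles)


def mean_incoherent_intensity(n_dipoles, trials=2000, seed=0):
    """Random phases: averaged over many trials the intensity is close to N."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_dipoles)]
        total += emitted_intensity(phases)
    return total / trials


if __name__ == "__main__":
    n = 50
    print("phased dipoles:  ", coherent_intensity(n))
    print("unphased dipoles:", mean_incoherent_intensity(n))
```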
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COMPUTER  MODELING  AND  EXAMINATION  OF  EXCITATION  RATES  FOR  A 
BISMUTH VAPOR LASER AT 4722 Å

W. T. Walter and N. Solimene

Laser action at 4722 Å in bismuth vapor was predicted to be a member of a class of
efficient, pulsed gas-discharge lasers. Laser action at this wavelength would be very
close to the center of the water transmission band. Although a number of laboratories
have examined discharges in bismuth vapor, laser action has not yet been reported.^
Bismuth vapor has a substantial fraction (~50%) of dimers, Bi2 molecules, which could
prevent laser action by means of several absorption processes. Two methods of
dissociating the dimers, thermal^2 and double-pulse discharge,^3 have been
experimentally explored without achieving laser action. The spontaneous emission
lifetime of the 4722 Å 6p²7s ⁴P₁/₂ → 6p³ ²D₃/₂ transition has been critically
evaluated^ (τ = 190 nsec). Several low-inductance discharge configurations were
constructed which produced excitation current pulse risetimes as low as 25 nsec (≪ τ),
but these were also unsuccessful in producing laser action.^

Williams, Trajmar and Bozinis^5 (WTB) have carried out an experimental determination of
electron impact cross sections for excitation of both the upper ⁴P₁/₂ and lower ²D₃/₂
proposed bismuth laser levels. On the basis of their measurements for 40 eV incident
electrons, Trajmar and Williams suggested^ that the electron impact cross sections are
small, are about the same for upper and lower levels and do not provide a basis for an
efficient production of a population inversion. With the aid of the computer model
developed at Polytechnic for the dynamics of pulsed metal-vapor lasers, we have
examined the possibility of laser action at 4722 Å in bismuth vapor and find that the
WTB measurements actually predict the production of high-power, efficient laser output.
We shall now show that the present state of knowledge of electron excitation cross
sections is entirely consistent with the generation of strong laser action at 4722 Å
and provides no basis to account for its absence.

A computer modeling approach is required for an analysis of potential laser action at
4722 Å because of the large number of excitation and relaxation channels as well as the
large number of bismuth atomic energy levels involved. The model includes rate
equations for the population of as many as 15 energy levels of the laser species which
may be directly or indirectly involved in the development of laser action. Also
included are rate equations for the population of an additional species which is
modeled by two energy levels, the ground state atomic and ionic levels. This species,
usually argon, is present to act as a buffer gas and to facilitate electrical breakdown
and discharge.

Primary excitation in the model is provided by electron collisions. Relaxation
processes are electron de-exciting collisions, radiative relaxation and diffusion to
and de-excitation at the walls of the container. The rate constants for transitions
between the various energy levels and for ionization depend on the cross sections for
these processes and on the electron number density and energy distribution. The
electron energy distribution is assumed to be Maxwell-Boltzmann throughout the
discharge pulse, but the average energy changes as determined by an energy balance
equation. Since in most cases the cross sections for ionization or for excitation are
not known, they have been estimated using the semiclassical formulas of Gryzinski.
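
The dependence of a rate constant on the cross section and on a Maxwell-Boltzmann
electron energy distribution can be sketched as the average of σ(E)v(E) over that
distribution. The code below is an illustration under stated assumptions and is not the
Polytechnic model: the step-shaped cross section is a hypothetical placeholder (it is
not the Gryzinski formula), the 4 eV threshold and 3 eV temperature are arbitrary, and
the midpoint-rule integration is simply a convenient choice.

```python
import math

ELECTRON_MASS_KG = 9.1093837e-31
EV_TO_JOULE = 1.602176634e-19


def maxwellian_energy_pdf(energy_ev, kt_ev):
    """Maxwell-Boltzmann electron energy distribution f(E) in 1/eV (integrates to 1)."""
    return (2.0 * math.sqrt(energy_ev / math.pi) * kt_ev ** -1.5
            * math.exp(-energy_ev / kt_ev))


def electron_speed_cm_per_s(energy_ev):
    """Speed of an electron with kinetic energy E (non-relativistic)."""
    return 100.0 * math.sqrt(2.0 * energy_ev * EV_TO_JOULE / ELECTRON_MASS_KG)


def rate_coefficient(cross_section_cm2, kt_ev, e_max_factor=40.0, steps=20000):
    """Rate coefficient k = integral of sigma(E) v(E) f(E) dE, in cm^3/sec.

    cross_section_cm2 is any function of electron energy (eV) returning cm^2.
    The integral is evaluated with the midpoint rule up to e_max_factor * kT.
    """
    e_max = e_max_factor * kt_ev
    de = e_max / steps
    total = 0.0
    for i in range(steps):
        energy = (i + 0.5) * de
        total += (cross_section_cm2(energy) * electron_speed_cm_per_s(energy)
                  * maxwellian_energy_pdf(energy, kt_ev))
    return total * de


def placeholder_step_cross_section(energy_ev, threshold_ev=4.0, sigma_cm2=1e-16):
    """Hypothetical step cross section, used only to exercise the integral."""
    return sigma_cm2 if energy_ev >= threshold_ev else 0.0


if __name__ == "__main__":
    k = rate_coefficient(placeholder_step_cross_section, kt_ev=3.0)
    print(f"rate coefficient for the placeholder cross section: {k:.3e} cm^3/sec")
```

Multiplying such a coefficient by the electron number density gives the per-atom
excitation rate that enters a rate equation.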

The computer model was applied to the situation in bismuth using the lowest 15 atomic
energy levels and the ground level of Bi+ from Moore's tabulation^8 and the transition
probabilities of Corliss and Bozman.^9 Radiation trapping was included. Where values of
radiation transition probabilities had not been measured, a wall collision relaxation
rate of 10³ sec⁻¹ was assumed. Figure 1 shows a computer print-out for the time
development of the absolute magnitude of the capacitor voltage (V), the discharge-tube
current density (I) and the blue laser output anticipated at 4722 Å (*). The computer
print-out models the situation where a 2000-pF capacitor charged to 15 kV has suddenly
been applied to a discharge tube 2.5 cm in diameter with a hot zone 25 cm long
containing 10^^ Bi atoms/cm³. A circuit inductance of 1 μH has been assumed as well as
a laser cavity time constant of 3.3 nsec. To facilitate identification of the expected
laser output at 4722 Å, a solid line has been drawn through the computer generated
points (*). The vertical full scale values are 15 kV voltage, 100 amperes/cm² current
density and 50 kW laser output power. The full scale horizontal value is 100
nanoseconds.
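
For orientation, the circuit values quoted above fix the energy available per pulse and
the natural time scale of the discharge circuit. The sketch below assumes an ideal,
lossless LC circuit, which is a simplification introduced here and not part of the
model described in the text (the discharge itself is resistive and time-varying);
function names are hypothetical.

```python
import math


def stored_energy_joule(capacitance_f, voltage_v):
    """Energy stored in the charged capacitor, (1/2) C V^2."""
    return 0.5 * capacitance_f * voltage_v ** 2


def lc_time_constant_s(inductance_h, capacitance_f):
    """Characteristic time sqrt(LC) of an ideal (lossless) LC circuit."""
    return math.sqrt(inductance_h * capacitance_f)


if __name__ == "__main__":
    capacitance = 2000e-12  # 2000 pF
    voltage = 15e3          # 15 kV
    inductance = 1e-6       # 1 microhenry

    energy = stored_energy_joule(capacitance, voltage)
    t_lc = lc_time_constant_s(inductance, capacitance)
    print(f"stored energy: {energy:.3f} J")
    print(f"sqrt(LC): {t_lc * 1e9:.1f} nsec")
    print(f"quarter period (pi/2 * sqrt(LC)): {0.5 * math.pi * t_lc * 1e9:.1f} nsec")
```

Under this idealization the circuit time scale is of the same order as the 100 nsec
full-scale horizontal value of the print-out.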

It is evident from Fig. 1 that in spite of the competing processes within the bismuth
atom, the computer model predicts that a population inversion will be obtained in the
4722 Å transition under conditions which simulate our experimental apparatus. The peak
intensity of the expected laser pulse is 45 kW and the full width at half maximum
intensity ~90 nsec.

Now let us compare the WTB experimentally-determined electron-impact cross sections^
with the calculated Gryzinski cross sections used thus far in the model. Unfortunately
the WTB measurements were carried out only at an electron energy of 40 eV, whereas
production of a population inversion in a pulsed discharge will be determined by the
cross section values at much lower electron energies. Nevertheless a comparison of the
Gryzinski calculation with the WTB measurement may still be illustrative. At 40 eV our
Gryzinski calculation of the electron excitation cross section of the proposed upper
laser level (⁴P₁/₂) was approximately one-half that of the WTB value, and our Gryzinski
estimate of the cross section for electron excitation of the lower laser level (²D₃/₂)
approximately 1/30 that of the WTB value. If we consider our Gryzinski results to be
relative excitation curves and use the WTB values to normalize them at 40 eV, that is,
if the upper laser level excitation rates are increased by a factor of 2 and the lower
level excitation rates are increased by a factor of 30 throughout the entire electron
energy range, the resulting computer model predictions are shown in Figure 2.
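
The normalization step described above amounts to scaling a calculated curve by one
constant so that it matches a measurement at a single reference energy. A minimal
sketch follows; the curve values are hypothetical placeholders in arbitrary units,
chosen only so that the calculated value at 40 eV is one-half (upper level) or 1/30
(lower level) of the reference value, as stated in the text.

```python
def normalization_factor(measured_at_reference, calculated_at_reference):
    """Constant that maps the calculated curve onto the measurement at one energy."""
    return measured_at_reference / calculated_at_reference


def normalize_curve(calculated_curve, factor):
    """Scale every point of a {energy_eV: cross_section} curve by the same factor."""
    return {energy: sigma * factor for energy, sigma in calculated_curve.items()}


if __name__ == "__main__":
    # Hypothetical relative curves (arbitrary units), not values from the report.
    upper_calculated = {10.0: 0.8, 20.0: 1.2, 40.0: 1.0}
    lower_calculated = {10.0: 2.0, 20.0: 1.5, 40.0: 1.0}

    upper_factor = normalization_factor(2.0, upper_calculated[40.0])    # factor of 2
    lower_factor = normalization_factor(30.0, lower_calculated[40.0])   # factor of 30

    print(normalize_curve(upper_calculated, upper_factor))
    print(normalize_curve(lower_calculated, lower_factor))
```

The shape of each curve is unchanged; only its overall scale is tied to the 40 eV
measurement.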

Figure 2 indicates that a strong laser output would still be expected under these
conditions. Although the expected peak laser output decreased from 45 to 30 kW as a
result of the change in electron excitation cross sections, it is still stronger than
the modeling results for the copper laser output under the same conditions. There is no
justification, therefore, for the statements "... only a small population inversion can
be obtained with 40 eV electrons"^5 or "... no efficient population inversion by
electron impact can be expected." The ratio of the WTB values for upper and lower
excitation cross sections, which is 13:1, is entirely consistent with production of a
substantial population inversion. In fact the WTB results for bismuth are quite similar
to their values for copper and lead, and both of these vapors have furnished strong
pulsed laser action. In bismuth WTB normalized their electron excitation cross sections
to Lvov's value^ for the 3067 Å resonance transition oscillator strength. A critical
evaluation^ has shown that Lvov's f value is too low by a factor of 2. Therefore the
WTB values^ as reported for electron excitation of both the upper ⁴P₁/₂ and the lower
²D₃/₂ levels of bismuth should each be increased by a factor of 2. On the other hand
the Williams-Trajmar (WT) reported values for electron excitation of the upper
²P₃/₂,₁/₂ and lower ²D₅/₂,₃/₂ levels of copper have been reported^ to be too large by
at least a factor of 7. After these changes are made, the corrected values for integral
electron excitation cross sections of the upper and lower laser levels in copper, lead
and bismuth are compared in Table I.

Table I indicates that the electron excitation cross sections of the resonance levels
of the three metals Cu, Pb and Bi are quite comparable at 40 eV. Excitation cross
sections of the metastable lower laser levels are considerably smaller for all three
metals at 40 eV. Although the metastable level excitation cross section is larger for
Bi than for Cu or Pb, it is still so much smaller than the resonance level excitation
cross section that it may be neglected in a first order analysis. For example, if the
pulsed discharge consisted entirely of 40 eV electrons, of the Bi atoms excited to one
of these two levels 93% would be in the upper laser level.
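
The 93% figure follows from the 13:1 ratio of upper to lower excitation cross sections
quoted earlier. A short check, assuming (as the sentence above does) that excitation by
40 eV electrons into these two levels is the only process considered:

```python
def upper_level_fraction(upper_to_lower_ratio):
    """Fraction of excited atoms in the upper level when only two levels compete.

    With cross sections in the ratio r:1, the upper level receives r / (r + 1)
    of the excitations.
    """
    return upper_to_lower_ratio / (upper_to_lower_ratio + 1.0)


if __name__ == "__main__":
    fraction = upper_level_fraction(13.0)
    print(f"upper laser level share: {fraction:.1%}")
```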

Again it should be emphasized that it is the electron excitation cross sections at
lower electron energies (1 - 10 eV) which will determine the direct production of a
population inversion at 4722 Å in a pulsed discharge in bismuth. The present state of
knowledge of electron excitation cross sections of the upper ⁴P₁/₂ and lower ²D₃/₂
levels is entirely consistent with the generation of strong laser action at 4722 Å and
provides no basis to account for its absence.* The most likely explanation for the
difficulty in achieving laser action at 4722 Å in bismuth vapor is competing processes
within the bismuth atom. In particular the Corliss-Bozman value for the 6p²7s ⁴P₃/₂ →
6p³ ⁴S₃/₂ transition probability is at least a factor of 20 too low.^14,15
Continuation of the computer modeling may indicate whether excitation of the 6p²7s
⁴P₃/₂,₅/₂ levels provides competing excitation channels which hinder laser action by
strong spontaneous emission to the 6p³ metastable level.

Fig. 2. Computer modeling of a pulsed discharge in bismuth vapor where the
capacitor voltage (V), discharge tube current density (I) and expected
4722 Å laser output (*) are shown as a function of time. Gryzinski
cross sections have been normalized to WTB values at 40 eV.

TABLE I. Comparison of integral electron excitation cross sections
(in units of 10⁻¹⁶ cm²) for the upper and lower laser levels
in atomic copper, lead and bismuth using corrected WTB^
and WT^ values as described in the text.

                            Incident Electron Energy
                            20 eV       40 eV       60 eV

Upper Laser Levels
  Cu(²P₃/₂,₁/₂)             11.0                    5.2
  Pb(³P₁)                               8.4
  Bi(⁴P₁/₂)                             7.6

Lower Laser Levels
  Cu(²D₅/₂,₃/₂)             0.26                    0.051
  Pb(¹D₂)                               0.050
  Bi(²D₃/₂)                             0.58

*After submission of this report, weak stimulated emission at 4722 Å was reported
by S. V. Markova, G. G. Petrash and V. M. Cherezov [Sov. J. Quantum Electron.,

A POSSIBLE EXPLANATION FOR THE ABSENCE* OF LASER ACTION IN BISMUTH
VAPOR AT 4722 Å

W.T.  Walter 

Four possible explanations have been suggested (Ref. 1) for the absence* of laser action
at 4722 Å in pulsed discharges in bismuth vapor:

(1) Insufficiently fast risetime of the excitation current pulse (Refs. 2, 3),

(2) The presence of dimers, Bi$_2$ molecules, in the vapor (Refs. 2, 4),

(3) Unfavorable ratio of electron excitation cross sections for the resonance
and metastable levels, and

(4) Competing processes, such as excitation to levels other than the
resonance level and cascade filling of the metastable, proposed
lower laser level.

In  the  references  listed  we  have  already  shown  that  it  is  unlikely  that  any  of  the  first 
three  explanations  can  account  for  the  absence  of  laser  action. 

Thus far explanation (4), competing processes within the Bi atom, also has not
provided an answer. The computer modeling results displayed in Figs. 1 and 2 of the
preceding article indicate that utilization of Corliss-Bozman transition probabilities to
model the rates of competing electron excitation and radiative relaxation processes does
not substantially reduce the population inversion expected. The possibility of significant
errors in the Corliss-Bozman transition probabilities must be considered. Transition
probabilities of some 25,000 classified lines in 70 elements have been calculated by
Corliss and Bozman from the intensity tables of Meggers, Corliss and Scribner.
Because the Corliss-Bozman values remain the only consistent set of transition
probabilities spanning the stronger spectral lines of most of the solid elements, they are
extremely useful. Substantial errors, however, have been reported in several of the
Corliss-Bozman values, some as large as a factor of 20. Therefore, one is reluctant
to rely very strongly on these values. We shall now critically examine these
Corliss-Bozman values for bismuth.

As indicated in Fig. 1, the direct laser excitation channel is electron excitation
of the 3067 Å $^4S_{3/2} \rightarrow {}^4P_{1/2}$ transition. On the basis of LS coupling, the
strengths of the competing excitation channels $^4S_{3/2} \rightarrow {}^4P_{3/2}$ and
$^4S_{3/2} \rightarrow {}^4P_{5/2}$ are expected to be 2/3 and 1/3 respectively of that of the
direct laser excitation channel. The Corliss-Bozman gA values can be converted to
Condon-Shortley line strengths by

$$ S(J, J') = 4.936 \times 10^{-19}\, \lambda^3\, gA . \tag{1} $$

The relative Corliss-Bozman line strengths for the
$6p^3\ ^4S_{3/2} \rightarrow 6p^2\,7s\ ^4P_{1/2,\,3/2,\,5/2}$ multiplet then are 1 : .008 : .26.
Compared with the LS coupling line strengths of 1 : .667 : .333, we see that the
strength of the $^4S_{3/2} \rightarrow {}^4P_{3/2}$ component is more than 80 times weaker
than expected.

Fig. 1. Proposed pulsed laser transition at 4722 Å in bismuth
vapor indicated on a partial energy level diagram of
bismuth. Figures in parentheses are Corliss-Bozman
transition probabilities in units of $10^8$ sec$^{-1}$.
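As a rough illustration of Eq. (1) and of the comparison with LS coupling, the following Python sketch converts between gA values and line strengths and forms the relative strengths of the Bi multiplet from the Corliss-Bozman line strengths listed in Table I. The function names are hypothetical, and the units (wavelength in Å, gA in sec^-1, S in atomic units) are an assumption, since the text does not state them.

```python
# Sketch of Eq. (1): S = 4.936e-19 * lambda^3 * gA.
# Assumed units (not stated in the text): lambda in angstroms, gA in sec^-1.
EQ1_FACTOR = 4.936e-19


def line_strength(wavelength_angstrom, gA):
    """Condon-Shortley line strength from a gA value, following Eq. (1)."""
    return EQ1_FACTOR * wavelength_angstrom ** 3 * gA


def gA_from_strength(wavelength_angstrom, S):
    """Inverse of Eq. (1): recover gA from a line strength."""
    return S / (EQ1_FACTOR * wavelength_angstrom ** 3)


def relative_strengths(strengths):
    """Normalize a list of line strengths to its first entry."""
    reference = strengths[0]
    return [s / reference for s in strengths]


# Corliss-Bozman line strengths for Bi from Table I (3067, 2228 and 2062 A lines).
S_CB_BI = [9.97, 0.076, 2.60]
# LS coupling expectation for the 4P(1/2), 4P(3/2), 4P(5/2) components.
LS_COUPLING = [1.0, 2.0 / 3.0, 1.0 / 3.0]

rel = relative_strengths(S_CB_BI)
# How many times weaker each component is than the LS coupling expectation.
deficit = [expected / observed for expected, observed in zip(LS_COUPLING, rel)]

for wavelength, r, d in zip([3067, 2228, 2062], rel, deficit):
    print(f"{wavelength} A: relative S = {r:.3f}, LS expectation / observed = {d:.1f}")
```

Only the ratios matter for the comparison with LS coupling, so the conclusion about the 2228 Å component does not depend on the assumed units.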


When the Corliss-Bozman values for the similar transitions are examined in
antimony and arsenic, the elements above bismuth in column VA of the periodic table,
the $^4S_{3/2} \rightarrow {}^4P_{3/2}$ component of the multiplet is not observed to be
substantially weaker than the other two components. For Sb the Corliss-Bozman relative
line strengths for the $5p^3\ ^4S_{3/2} \rightarrow 5p^2\,6s\ ^4P_{1/2,\,3/2,\,5/2}$ multiplet
are 1 : 1.39 : 3. For As the Corliss-Bozman relative line strengths for the
$4p^3\ ^4S_{3/2} \rightarrow 4p^2\,5s\ ^4P_{1/2,\,3/2,\,5/2}$ multiplet are
1 : .67 : (not measured).


Recently intermediate coupling calculations have been carried out for
$6p^2\,7s \rightarrow 6p^3$ transitions in bismuth (Refs. 9, 10). The transition
probabilities calculated depend on the local exchange approximation as well as on the
dipole length or dipole velocity forms of the transition probability operator since the
self-consistent field wavefunctions are not exact. The difference between the dipole
length and dipole velocity calculations can be viewed as an indication of the inexactness
of the wavefunctions. Line strengths can be calculated from these intermediate coupling
transition probabilities by Equation (1). They are listed in Table I for comparison with
the Corliss-Bozman line strengths and the LS coupling relative line strengths.

TABLE I. Comparison of line strengths for the resonance multiplets of arsenic,
antimony and bismuth. S_CB calculated from Corliss-Bozman's
values in Reference 6. S_DV and S_DL calculated from Holmgren's
values in Ref. 9 for dipole velocity and dipole length approximations.
S_KM calculated from Kunisz and Migdalek's values in Reference 10.

                           4S(3/2) -> 4P(1/2)   4S(3/2) -> 4P(3/2)   4S(3/2) -> 4P(5/2)

LS Coupling
  Relative S                    1.000                0.667                0.333

As (4p^3 -> 4p^2 5s)
  wavelength                    1972 Å               1937 Å               1890 Å
  S_CB                          1.89                 1.26                 --
  S_DV                          .848                 1.64                 2.30
  S_DL                          1.51                 2.84                 4.00

Sb (5p^3 -> 5p^2 6s)
  wavelength                    2312 Å               2176 Å               2069 Å
  S_CB                          .915                 1.27                 2.75
  S_DV                          1.20                 1.95                 2.50
  S_DL                          2.45                 3.93                 4.93

Bi (6p^3 -> 6p^2 7s)
  wavelength                    3067 Å               2228 Å               2062 Å
  S_CB                          9.97                 0.076                2.60
  S_DV                          3.82                 1.55                 2.27
  S_DL                          6.75                 2.64                 3.61
  S_KM                          4.82                 4.79                 .343

Table I reveals that the Corliss-Bozman line strength for the 2228 Å
$^4S_{3/2} \rightarrow {}^4P_{3/2}$ transition in Bi is surprisingly small on the basis of the
following comparisons. First, Holmgren's intermediate coupling line strengths for the
$^4S_{3/2} \rightarrow {}^4P_{3/2}$ transitions in As, Sb and Bi are comparable (within a factor
of 2.5) with the $^4S_{3/2} \rightarrow {}^4P_{1/2}$ line strengths regardless of whether dipole
velocity or dipole length approximations are used. Second, the intermediate coupling
line strengths of Kunisz and Migdalek for the $^4S_{3/2} \rightarrow {}^4P_{1/2}$ and
$^4S_{3/2} \rightarrow {}^4P_{3/2}$ transitions in Bi are nearly identical. Furthermore, Corliss
and Bozman's line strengths for 7 of the 8 transitions observed in As, Sb and Bi (i.e.,
all except for the 2228 Å Bi transition) are within a factor of 3 of the intermediate
coupling line strengths of Holmgren using either the dipole velocity or dipole length
approximations. The Corliss-Bozman line strength of the 2228 Å
$^4S_{3/2} \rightarrow {}^4P_{3/2}$ transition is at least 20 times smaller than the intermediate
coupling line strengths. All indications are, therefore, that this Corliss-Bozman value
is anomalously small.
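The comparisons drawn from Table I can be checked with a few lines of Python. The sketch below transcribes the Corliss-Bozman (CB) and Holmgren (DV, DL) line strengths from Table I and counts how many measured CB values lie within a factor of 3 of a Holmgren value. It assumes that "either" means agreement with at least one of the two approximations; the dictionary layout and function names are hypothetical.

```python
# Line strengths transcribed from Table I; None marks the As line that was not measured.
TABLE_I = {
    ("As", 1972): {"CB": 1.89, "DV": 0.848, "DL": 1.51},
    ("As", 1937): {"CB": 1.26, "DV": 1.64, "DL": 2.84},
    ("As", 1890): {"CB": None, "DV": 2.30, "DL": 4.00},
    ("Sb", 2312): {"CB": 0.915, "DV": 1.20, "DL": 2.45},
    ("Sb", 2176): {"CB": 1.27, "DV": 1.95, "DL": 3.93},
    ("Sb", 2069): {"CB": 2.75, "DV": 2.50, "DL": 4.93},
    ("Bi", 3067): {"CB": 9.97, "DV": 3.82, "DL": 6.75},
    ("Bi", 2228): {"CB": 0.076, "DV": 1.55, "DL": 2.64},
    ("Bi", 2062): {"CB": 2.60, "DV": 2.27, "DL": 3.61},
}


def factor(a, b):
    """Symmetric ratio (always >= 1) between two positive values."""
    return max(a, b) / min(a, b)


def closest_factor(row):
    """Smallest disagreement between the CB value and either Holmgren value."""
    return min(factor(row["CB"], row["DV"]), factor(row["CB"], row["DL"]))


# Keep only the transitions for which a Corliss-Bozman value exists.
measured = {key: row for key, row in TABLE_I.items() if row["CB"] is not None}
within_3 = [key for key, row in measured.items() if closest_factor(row) <= 3.0]
outliers = [key for key in measured if key not in within_3]

print("measured transitions:", len(measured))
print("within a factor of 3:", len(within_3))
print("outliers:", outliers)
```

Under this reading the only outlier is the 2228 Å Bi line, whose Corliss-Bozman strength falls short of both Holmgren values by a factor of 20 or more.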


The spectroscopic evidence, both in terms of calculations for Bi as well as
calculations and measurements for the analogous elements As and Sb, indicates that the
$6p^3\ ^4S_{3/2} \rightarrow 6p^2\,7s\ ^4P_{3/2}$ is a strong transition. Significant electron
excitation of the $6p^2\,7s\ ^4P_{3/2}$ level may be expected, therefore, during a pulsed
discharge in bismuth vapor. This expectation is supported by the energy loss spectrum
for 40 eV electrons incident upon a bismuth vapor beam, which has been measured by
Williams, Trajmar and Bozinis (Ref. 11) (WTB) and is displayed in Figure 2.


Fig. 2. Energy loss spectrum for 40 eV electrons incident
on a bismuth vapor beam measured by
Williams, Trajmar and Bozinis. This figure
is Fig. 1 in Reference 11.


The first large peak in Fig. 2 must correspond to excitation of the $6p^2\,7s\ ^4P_{1/2}$
resonance level at 4.04 eV and direct excitation of the proposed upper laser level. The
other possibility, the $6p^3\ ^2P_{3/2}$ level at 4.11 eV, is part of the ground level
configuration, and direct excitation should be very small as indicated by the weak
strengths of the other $6p^3$ levels at 1.42, 1.91 and 2.69 eV. The second large peak must
correspond to excitation of the $6p^2\,7s\ ^4P_{3/2}$ level at 5.56 eV. The
$6p^2\,6d\ ^2D_{3/2,\,5/2}$ levels at 5.44 and 5.56 eV are not as strongly coupled to the
$6p^3\ ^4S_{3/2}$ ground level and are, therefore, not expected to account for more than a
small portion of the second large peak. Finally, where the energy loss curve in Fig. 2
ends, it is very likely beginning to rise to a third strong peak at 6.01 eV which
corresponds to the $6p^2\,7s\ ^4P_{5/2}$ fine-structure member of the bismuth resonance
level.


The WTB energy loss spectrum in Fig. 2 confirms the spectroscopic-based
conclusions of preceding paragraphs based on Table I. During a pulsed discharge,
therefore, the $6p^2\,7s\ ^4P_{3/2}$ level may be expected to be populated at a rate comparable
to that of the $^4P_{1/2}$ level. As indicated in Fig. 1 and Table II spontaneous radiation at
2989 Å is expected to be very strong, stronger than the 4722 Å spontaneous radiation,
and therefore can seriously interfere with the establishment of the proposed population
inversion. Although the several gA values given in Table II for the 2989 and 2697 Å
lines are not as consistent as they are for the 4722 Å line (variations of 57 and 40
compared with 2), still they suggest that spontaneous radiation of these first two lines
will be much more effective in populating the proposed $^2D_{3/2}$ lower laser level than will
spontaneous radiation at 4722 Å.
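The quoted variations of 57, 40 and 2 can be reproduced from the Table II entries as the ratio of the largest to the smallest gA value for each line. The short Python sketch below does this; the data layout and names are hypothetical, and the values are transcribed from Table II in the order CB, DV, DL, KM.

```python
# gA values (units of 10^8 per sec) from Table II, ordered CB, DV, DL, KM.
TABLE_II = {
    4722: [0.18, 0.132, 0.099, 0.091],
    2989: [17.0, 1.16, 2.72, 0.30],
    2697: [4.0, 0.10, 0.41, 1.57],
}


def spread(values):
    """Ratio of the largest to the smallest tabulated value."""
    return max(values) / min(values)


spreads = {wavelength: spread(values) for wavelength, values in TABLE_II.items()}

# Source-by-source ratio of the 2989 A rate to the 4722 A rate.
ratio_2989_to_4722 = [a / b for a, b in zip(TABLE_II[2989], TABLE_II[4722])]

for wavelength, value in spreads.items():
    print(f"{wavelength} A: max/min = {value:.1f}")
print("2989 A / 4722 A by source:", [round(r, 1) for r in ratio_2989_to_4722])
```

Every one of the four sources gives a larger rate at 2989 Å than at 4722 Å, which is the sense in which the scattered values still support the argument.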


TABLE II. Comparison of spontaneous radiation rates into the $^2D_{3/2}$ proposed
lower laser level from the $^4P_{1/2,\,3/2,\,5/2}$ levels. gA_CB are Corliss-Bozman's
values from Reference 6. gA_DV and gA_DL are Holmgren's values from Ref. 9 for dipole
velocity and dipole length approximations; gA_KM are Kunisz and Migdalek's values
from Reference 10. (The tabulated values are in units of $10^8$ photons/sec.)

              4P(1/2) -> 2D(3/2)    4P(3/2) -> 2D(3/2)    4P(5/2) -> 2D(3/2)
              4722 Å                2989 Å                2697 Å
              (10^8/sec)            (10^8/sec)            (10^8/sec)

  gA_CB       .18                   17.                   4.0
  gA_DV       .132                  1.16                  .10
  gA_DL       .099                  2.72                  .41
  gA_KM       .091                  .30                   1.57

The most likely explanation for the absence* of laser action at 4722 Å in bismuth
vapor appears to be that the Corliss-Bozman transition probability for the 2228 Å
$^4P_{3/2} \rightarrow {}^4S_{3/2}$ line is too low by at least a factor of 20. Strong population
of the $6p^2\,7s\ ^4P_{3/2}$ level during a pulsed discharge and strong spontaneous radiation
at 2989 Å then could fill the $^2D_{3/2}$ metastable level and prevent the build-up of a
sufficient population inversion on the 4722 Å line. Computer modeling studies are
continuing to examine this hypothesis. Such computer modeling will also be helpful in
evaluating the possibilities of similar potentially-efficient laser transitions in other
atomic vapors.

ARPA under Office of Naval Research
N00014-67-A-0438-0017

* After submission of this report, weak stimulated emission at 4722 Å was reported
by S. V. Markova, G.G. Petrash and V.M. Cherezov [Sov. J. Quantum Electron. 657
(1977)]. Stimulated emission was observed only as a thin ring around the o.d. of the
discharge tube; it did not fill the tube's cross section. Maximum average-power
output was 17 mW, which is far less than the 2 W achieved at 6278 Å in gold vapor or 46 W
achieved at 5105 Å in copper vapor by Petrash's group in similar discharge-heated
apparatus. The explanation presented in this report involving competitive processes
in the Bi atom still appears to be the most likely explanation to account for the difficulty
in obtaining laser action on the 4722 Å Bi line and for the very weak output power
compared with other members of the class of efficient, pulsed atomic-vapor lasers.

RATIONAL APPROXIMATION OF TEMPERATURE-DEPENDENT KINETIC
COEFFICIENTS FOR EFFICIENT COMPUTER MODELING

G. M. Kull, N. Solimene and W. T. Walter


The modeling of a laser plasma discharge requires the solution of a set of
coupled, non-linear, stiff differential equations representing the population densities
and temperatures of the plasma components. These equations are coupled through
collision processes, such as reaction rates and collision frequencies, which are
temperature dependent. Because this system of equations is stiff and non-linear, an
iterative implicit method of solution is used. Due to the iterative nature of the solution,
the coupling coefficients must be calculated a number of times at each time step.
Therefore, an efficient method for representing these coefficients is required.

The first step in efficient approximation is to obtain an approximating function
which will have the same behavior as the original function. The most common types
of approximating functions are: polynomial series, rational function, trigonometric sum,
spline, and special function approximations (Ref. 2). The long time required for the
evaluation of trigonometric and special (i.e. exp, log, etc.) functions eliminates these from
prime consideration. Also, the spline method will not be considered, due to its
inefficient requirement that the region of interest be subdivided into a set of piecewise
polynomial functions. Calculation of the polynomial approximation is efficient, but
an inherent problem with this method is its inability to approximate sharp bends
within a relatively smooth region. The rational function approximation does not have any
of the above problems. Being a ratio of two polynomials, it can be efficiently
calculated and can take on the behavior of the function where regular polynomial
approximations fail. Thus, the rational function approximation is the logical choice.
Determination of the coefficients for the terms of a rational approximation is the difficult
part of this method. The procedure which has been developed to optimize the choice
of term coefficients to match collision and reaction rates over a broad energy range
will now be described.


The collision frequency and reaction rate have been observed to have the
following functional dependence on temperature:

$$ \langle \nu \rangle \propto f(T^{1/2}) $$

$$ K(T) \propto g(T^{1/2})\, e^{-E^*/kT} . $$

The reaction rate has an additional exponential factor dependent on the activation
energy, E*. This leads to the following choice of a rational function approximation to
$f(T^{1/2})$ and $g(T^{1/2})$:

$$ R(T) = \frac{P(T)}{Q(T)} = \frac{p_0 + p_1 T^{1/2} + p_2 T + p_3 T^{3/2} + \cdots + p_m T^{m/2}}{q_0 + q_1 T^{1/2} + q_2 T + q_3 T^{3/2} + \cdots + q_n T^{n/2}} . $$


The boundary conditions for $f(T^{1/2})$ and $g(T^{1/2})$ are such that as T goes to zero, f and
g go to zero, thus $p_0 = 0$. Also, as T goes to infinity f and g also go to zero. Thus
the order, n, of the polynomial in the denominator must be greater than the order, m,
of the polynomial in the numerator.
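A minimal Python sketch of how such a rational function might be evaluated is given below. It treats P and Q as polynomials in x = T^(1/2) and evaluates each by Horner's rule, which is one inexpensive way to do it and is not necessarily the scheme used by the authors. The coefficients are hypothetical, chosen only to satisfy p_0 = 0, n > m and Q(T) > 0.

```python
import math


def horner(coeffs, x):
    """Evaluate coeffs[0] + coeffs[1]*x + coeffs[2]*x**2 + ... by Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def rational(p, q, T):
    """R(T) = P(T)/Q(T), with P and Q polynomials in x = sqrt(T)."""
    x = math.sqrt(T)
    return horner(p, x) / horner(q, x)


# Hypothetical coefficients: m = 3, n = 4, p_0 = 0, and all q_i > 0 so Q(T) > 0.
p = [0.0, 2.0, -0.5, 0.3]
q = [1.0, 1.0, 0.5, 0.2, 0.1]

print(rational(p, q, 0.0))     # vanishes at T = 0 because p_0 = 0
print(rational(p, q, 4.0))     # a value at moderate T
print(rational(p, q, 1.0e12))  # tends to zero at large T because n > m
```

Each evaluation costs one square root, m + n multiply-adds and one division, with no exponential or trigonometric calls.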


As mentioned before, the major problem with the rational approximation is that
the p and q coefficients are not straightforward to calculate (Ref. 3). The first step in
obtaining these coefficients is to determine the ratio of n to m. This is done by obtaining
the temperature dependence of f and g as T goes to infinity. Once the ratio of n to m is
known, the lowest order of polynomials can be used in P and Q to give the desired fit.
This still leaves the problem of obtaining the p and q values. To do this, the maximum
error is minimized over a given discrete set of points (Ref. 4),

$$ r = \max_i \left| F(T_i) - R(T_i) \right| , $$

where F(T) is either $f(T^{1/2})$ or $g(T^{1/2})$. We must also require Q(T) to be greater than
zero to prevent a singularity in the region of interest. Because this method is
non-linear, an iterative method can be used. Expanding the error equation about a given
iteration point, k, we obtain the following constraints.


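$$ \left[ F(T_i) - R_k(T_i) \right] - \frac{\delta P(T_i) - R_k(T_i)\, \delta Q(T_i)}{Q_k(T_i)} \le r $$

$$ -\left[ F(T_i) - R_k(T_i) \right] + \frac{\delta P(T_i) - R_k(T_i)\, \delta Q(T_i)}{Q_k(T_i)} \le r $$

where $\delta Q = Q - Q_k$, $\delta P = P - P_k$ and $r_k = F(T_i) - R_k(T_i)$. Also, for this Taylor
expansion to be valid we must add the following constraints,

$$ |\delta q_i| < 1 , \qquad i = 0, 1, \ldots, n $$

$$ |\delta p_i| < 1 , \qquad i = 0, 1, \ldots, m $$

The above set of constraints are solved by linear programming techniques for $\delta q$ and $\delta p$
to minimize r. If r is not the best Tchebycheff approximation, the iteration is
continued using the new values of P and Q as the old values $P_k$ and $Q_k$.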
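The linearization behind this iteration can be illustrated numerically. The Python sketch below evaluates the maximum error r on a discrete set of points and compares the exact rational function after a small coefficient update with its first-order Taylor expansion about the previous iterate, R ≈ R_k + (δP − R_k δQ)/Q_k. It does not perform the linear programming step itself. The target function, the point set, the coefficient values and all names are hypothetical.

```python
import math


def poly(coeffs, x):
    """Evaluate coeffs[0] + coeffs[1]*x + ... by Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def rational(p, q, T):
    """R(T) = P/Q with P and Q polynomials in x = sqrt(T)."""
    x = math.sqrt(T)
    return poly(p, x) / poly(q, x)


def max_error(F, p, q, points):
    """r = max_i |F(T_i) - R(T_i)| over the discrete set of points."""
    return max(abs(F(T) - rational(p, q, T)) for T in points)


def linearized_rational(p_k, q_k, dp, dq, T):
    """First-order expansion of R about (P_k, Q_k): R_k + (dP - R_k*dQ)/Q_k."""
    x = math.sqrt(T)
    Q_k = poly(q_k, x)
    R_k = poly(p_k, x) / Q_k
    return R_k + (poly(dp, x) - R_k * poly(dq, x)) / Q_k


def target(T):
    """Hypothetical F(T): vanishes at T = 0 and as T goes to infinity."""
    return math.sqrt(T) / (1.0 + T)


points = [0.1, 0.5, 1.0, 2.0, 5.0, 10.0]
p_k, q_k = [0.0, 1.0], [1.0, 0.0, 0.9]   # current iterate (hypothetical)
dp, dq = [0.0, 0.01], [0.0, 0.0, 0.05]   # small update with |dp_i|, |dq_i| < 1
p_new = [a + b for a, b in zip(p_k, dp)]
q_new = [a + b for a, b in zip(q_k, dq)]

# Largest difference between the exact updated R and its linearized form.
linearization_gap = max(
    abs(rational(p_new, q_new, T) - linearized_rational(p_k, q_k, dp, dq, T))
    for T in points
)

print("r before update:", max_error(target, p_k, q_k, points))
print("r after update: ", max_error(target, p_new, q_new, points))
print("linearization gap:", linearization_gap)
```

For updates this small the linearized expression tracks the exact rational function closely, which is what allows each step to be posed as a linear program in δp and δq.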

A computer program has been written which implements the algorithm to obtain
the best rational approximation. As a demonstration of its ability we have obtained the
lowest order rational approximation to the argon momentum-transfer rate. Figure 1
shows a comparison with the experimental collision rate (Ref. 4). The agreement is quite
good. Term coefficients ($p_i$ and $q_i$) for the argon momentum-transfer rate rational
approximation are given in Table I.

Fig. 1. Argon momentum-transfer rate. (Horizontal axis: energy in eV.)

TABLE I. Term coefficients for rational function approximation
of the argon momentum-transfer rate.

  p_0 = 0.0                                  q_0 = 9.379518 x 10^(illegible)
  p_1 = 9.118165 x 10^(illegible)            q_1 = 1.00
  p_2 = -5.579821 x 10^(illegible)           q_2 = -1.375039 x 10^(illegible)
  p_3 = 8.954906 x 10^(illegible)            q_3 = -3.276637 x 10^(illegible)
                                             q_4 = 9.134490 x 10^(illegible)

Comparison of the computer time required to calculate the momentum-transfer
collision rate using the rational approximation with the time taken for the calculation
using an integration over the experimental data points or a partial wave representation
indicates that the rational function approximation calculation is about ten times faster.
We have shown that the use of rational approximations for reaction rates and collision
frequencies is an efficient method of representation over the broad energy region of
interest.

Joint Services Technical Advisory Committee

THE CHARACTERIZATION OF NOISE SOURCES BY ANGULAR CORRELATION
MEASUREMENTS

D.B. Scarl and S. McAfee

Second order angular field correlation measurements can provide information
about the shape and intensity of a distant noise source or scatterer. These
measurements, performed by averaging the product of the intensities at two field points, are
phase independent and therefore are very little degraded by fluctuations of the medium
in which the noise field is propagating. Acoustic and electromagnetic noise sources in
the microwave and optical regions can be characterized by second order measurements
made by an array of fixed detectors. In contrast to imaging and to first order
interferometry, which depend on the addition of field amplitudes and are therefore strongly
phase sensitive, second order or intensity correlations are sensitive only to
propagation delays or fluctuations that are comparable with the correlation time of the
detected signal. This correlation time can be, in the case of optical sources, orders of
magnitude greater than the period of the propagating waves.

Some of the features of second order correlation measurements can be outlined
by considering a simple two-dimensional wave propagation problem. If a source lying
along the x axis radiates into the half-plane z > 0, the radiation pattern depends on
A(x), the source amplitude distribution. A completely coherent source such as an
antenna or an acoustic transducer radiates an angular pattern that, in the far field, is the
Fourier transform of A(x), with X = kx playing the part of the transform variable t and
$\theta$, the angle between the z axis and the field point, playing the part of $\omega$. For example,
a source extending from $+\ell$ to $-\ell$ and radiating with uniform amplitude and phase
generates an angular amplitude distribution proportional to $\sin(L\theta)/L\theta$. (Here, $L = k\ell$.)
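The Fourier transform relation for the uniform coherent source can be checked numerically. The Python sketch below integrates exp(i k x θ) over a uniform source from −ℓ to +ℓ with a midpoint rule and compares the normalized result with sin(Lθ)/(Lθ), L = kℓ. It uses the small-angle form in which θ itself is the transform variable, as in the text; the wavelength, source size, number of integration steps and function names are hypothetical.

```python
import cmath
import math


def far_field_amplitude(theta, k, half_length, steps=2000):
    """Midpoint-rule integral of exp(i*k*x*theta) over -half_length <= x <= +half_length."""
    dx = 2.0 * half_length / steps
    total = 0j
    for j in range(steps):
        x = -half_length + (j + 0.5) * dx
        total += cmath.exp(1j * k * x * theta) * dx
    return total


def sinc_pattern(theta, k, half_length):
    """Closed form sin(L*theta)/(L*theta) with L = k*half_length."""
    u = k * half_length * theta
    return 1.0 if u == 0.0 else math.sin(u) / u


k = 2.0 * math.pi    # hypothetical: wavelength of 1 in arbitrary units
half_length = 5.0    # hypothetical: source extends from -5 to +5 wavelengths

for theta in (0.0, 0.02, 0.05, 0.1):
    numeric = far_field_amplitude(theta, k, half_length) / (2.0 * half_length)
    print(f"theta = {theta:.2f}: numeric = {numeric.real:+.6f}, "
          f"sin(L theta)/(L theta) = {sinc_pattern(theta, k, half_length):+.6f}")
```

The first null of the pattern falls at Lθ = π, so a longer source (larger L) gives a narrower angular pattern.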

If the same source radiates incoherently with the same intensity distribution as
before, the angular intensity distribution in the far field is very different, falling off
with $\cos(\theta)$ but providing almost no information about the size or excitation of the source.
However, for this incoherent source, the first order angular correlation function in the
far field is proportional to the Fourier transform of the intensity distribution across
the source. For an incoherent source extending from $+\ell$ to $-\ell$ the correlation function
is proportional to $\sin(L\Delta\theta)/L\Delta\theta$, where $\Delta\theta$ is the angular separation between detectors.
The correlation function, depending only on the angle between the detectors and not on
the angle between the detectors and the z axis, gives almost no information about the
location of the source, but does give information about its size and intensity
distribution. The Fourier transform relationship between source intensity and field angular
correlation is independent of phase variations at or near the source, but is affected by
phase differences arising from index of refraction fluctuations or motion at or near the
detectors. A characterization of fields in terms of angular correlation functions plays
an important part in Wolf's new theory of radiative energy transfer in electromagnetic
fields.

If the second order correlation function of an incoherently radiating or scattering
source is measured, the part of the correlation function that is dependent on $\Delta\theta$ is
proportional to the square of the Fourier transform of the intensity distribution across the
source. A linear source extending from $+\ell$ to $-\ell$ would create a second order angular
coherence pattern proportional to $1 + S\,[\sin(L\Delta\theta)/L\Delta\theta]^2$, where S is a factor having to
do with signal to noise ratios and detector properties. Now, however, the signal is
independent of phase changes from turbulence or motion either near the source or near
the detectors, as long as the delays those changes introduce are shorter than the
coherence time of the signal. The coherence time is set by the source spectrum and is
often much greater than the period of the central frequency component. Second order
correlation measurements have been used by Hanbury Brown to measure the angular
diameter of stars with an intensity interferometer in which the detector separation can be
as large as 200 meters. Although the correlation pattern from stars has a very large
spatial extent on the surface of the earth, the angle this correlation pattern subtends at
the star is less than
10  radians. 
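The second order pattern for the uniform linear source can be sketched in a few lines of Python. The function below evaluates 1 + S [sin(LΔθ)/(LΔθ)]^2, taking the squared form from the statement that the Δθ-dependent part is the square of the Fourier transform of the source intensity. The values of L and S and the function names are hypothetical.

```python
import math


def second_order_pattern(delta_theta, L, S):
    """1 + S * [sin(L*dtheta)/(L*dtheta)]**2 for a uniform linear source."""
    u = L * delta_theta
    sinc = 1.0 if u == 0.0 else math.sin(u) / u
    return 1.0 + S * sinc ** 2


def L_from_first_null(delta_theta_null):
    """The excess correlation first vanishes where L*dtheta = pi."""
    return math.pi / delta_theta_null


L = 31.4   # hypothetical L = k * (source half-length)
S = 0.5    # hypothetical signal-to-noise / detector factor

for delta_theta in (0.0, 0.05, math.pi / L):
    print(f"dtheta = {delta_theta:.4f}: pattern = {second_order_pattern(delta_theta, L, S):.4f}")
```

Because the pattern depends only on the detector separation Δθ, locating the first minimum of the excess correlation gives L, and hence the source size in wavelengths, without any knowledge of the source direction.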

Our recently completed measurement of the second order correlation from a
laboratory source of small extent has detected a correlation pattern that extends over
an angle of about 0.3 radians as viewed from the source. This measurement was done
with two detectors whose large sensitive areas overlapped completely. We have now
developed a new brighter chaotic source which will allow us to measure large angle
correlations with two small area non-overlapping detectors. These new measurements
should show that large angle second order correlation measurements are able to give
information about the intensity distribution of sources or scatterers whose size is
comparable with the central wavelength of the radiated or scattered noise signal.

A CHALLENGE FOR NONLINEAR SPECTROSCOPY
S. R. Barone

The resolution of saturation spectroscopy is now sufficient to observe the
recoil shift between the emission and absorption lines of light molecules. The
numerical value of the recoil shift is determined by energy-momentum conservation and is
in good agreement with the experimental value. The understanding of future
experimental results on, for example, the pressure and power dependence of line shapes,
requires a dynamical theory that contains assumptions beyond the conservation laws.
One group of theories is partly based on the assumption that the Hamiltonian

$$ H_0 = \frac{p^2}{2M} + \frac{1}{2}\hbar\omega_0\,\sigma_3 - \mu\,\sigma_1 E(z,t) \tag{1} $$

is appropriate for the description of the orbital motion of a molecule in an externally
applied, linearly polarized field, E(z, t). The purpose of this report is to point out:
1) the above Hamiltonian contains a force law which has no analog in classical
electrodynamics, 2) a modified version of the above Hamiltonian is consistent with classical
force laws, 3) experimental results sufficiently precise to establish the correct
electromagnetic force on a molecular dipole and the corresponding Hamiltonian would have
significant implications.

The notation is standard. The molecular mass, rest frame level separation, dipole
moment, position and momentum are M, $\hbar\omega_0$, $\mu$, z and p, respectively. The
three-vector ($\sigma_1/2$, $\sigma_2/2$, $\sigma_3/2$) satisfies spin-1/2 commutation rules; z and p satisfy
canonical commutation rules and commute with the $\sigma$'s.

1) With regard to the first point notice that the Heisenberg equations for the
orbital motion of a molecule which follow from the Hamiltonian Eq. (1) and the usual
commutation rules are

$$ M \frac{d^2 z}{dt^2} = \frac{\partial}{\partial z}\left[ \mu\,\sigma_1 E(z,t) \right] , \tag{2} $$

or

$$ M \frac{d^2 \mathbf{r}}{dt^2} = \nabla \left[ \mu\,\sigma_1 E(z,t) \right] . \tag{3} $$

According to the usual interpretation of Heisenberg equations the force causing orbital
acceleration of the molecule is proportional to the axial gradient of the electric field
at the location of the molecule. This is inconsistent with classical electrodynamics.
According to classical electrodynamics the force on an electric dipole ($\boldsymbol{\mu}$) is (Gaussian
units)

$$ \mathbf{F}_{\rm classical} = (\boldsymbol{\mu} \cdot \nabla)\, \mathbf{E} + \frac{1}{c} \frac{d\boldsymbol{\mu}}{dt} \times \mathbf{B} , \tag{4} $$

where c is the speed of light. The second term in the above expression is the analog,
for a dipole, of the Lorentz force on a charge. In a plane wave, for example, the
dipole moment oscillates at the same frequency as the field vectors and the two terms
in $\mathbf{F}_{\rm classical}$ are the same order of magnitude. Using Faraday's Law Eq. (4) may be
written

$$ \mathbf{F}_{\rm classical} = \nabla (\boldsymbol{\mu} \cdot \mathbf{E}) + \frac{1}{c} \frac{d}{dt} \left( \boldsymbol{\mu} \times \mathbf{B} \right) . \tag{5} $$

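The step from Eq. (4) to Eq. (5) can be sketched as follows, treating the dipole moment as independent of position when the gradient is taken and using Faraday's law in Gaussian units. This is an illustrative derivation under those assumptions, not a reproduction of the author's own working.

```latex
\begin{aligned}
\nabla(\boldsymbol{\mu}\cdot\mathbf{E})
  &= (\boldsymbol{\mu}\cdot\nabla)\mathbf{E}
     + \boldsymbol{\mu}\times(\nabla\times\mathbf{E})
   = (\boldsymbol{\mu}\cdot\nabla)\mathbf{E}
     - \frac{1}{c}\,\boldsymbol{\mu}\times\frac{\partial\mathbf{B}}{\partial t},
  \qquad
  \nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}, \\[4pt]
\mathbf{F}_{\rm classical}
  &= (\boldsymbol{\mu}\cdot\nabla)\mathbf{E}
     + \frac{1}{c}\frac{d\boldsymbol{\mu}}{dt}\times\mathbf{B}
   = \nabla(\boldsymbol{\mu}\cdot\mathbf{E})
     + \frac{1}{c}\left(
         \frac{d\boldsymbol{\mu}}{dt}\times\mathbf{B}
         + \boldsymbol{\mu}\times\frac{\partial\mathbf{B}}{\partial t}
       \right).
\end{aligned}
```

The bracket is the time derivative of μ × B apart from a term μ × (v·∇)B, which is proportional to the orbital velocity v and belongs with the velocity dependent forces that are neglected for slowly moving molecules.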


To within velocity dependent forces which are negligible for slowly moving molecules,
the above result differs from the right hand side of Eq. (3) by the time derivative of
$\boldsymbol{\mu} \times \mathbf{B}/c$. For a molecule in a radiation field the two terms in Eq. (5) are the same
order of magnitude. It is instructive to notice further that the interaction Hamiltonian
$-\boldsymbol{\mu} \cdot \mathbf{E}$ leads to a translational force

$$ \mathbf{F} = \nabla (\boldsymbol{\mu} \cdot \mathbf{E}) \tag{6} $$

which again consists of two terms of the same order of magnitude. Details of
spectroscopic experiments which are sensitive to the force law and not only the conservation
laws* can distinguish between Equations (4) and (6).

2) The linear momentum $-\boldsymbol{\mu} \times \mathbf{B}/c$ which accounts for the difference between Eqs.
(4) and (6) and is made explicit in Eq. (5) is ordinarily orders of magnitude smaller
than a transition photon's worth of linear momentum. Similarly, the dipole energy
$-\boldsymbol{\mu} \cdot \mathbf{E}$ is ordinarily orders of magnitude smaller than a transition photon's worth of
energy. Nevertheless because of their rapid variation both terms are important in
determining the orbital motion of a molecule. Physically the energy $-\boldsymbol{\mu} \cdot \mathbf{E}$ is the
energy of the induced electric dipole in the prescribed electric field. Similarly
$-\boldsymbol{\mu} \times \mathbf{B}/c$ is the linear momentum to be associated with a stationary electric dipole in
a magnetic field. This momentum can be regarded as residing in the field, viz. the
volume integral of the cross product of the electric field due to the induced dipole
and the external magnetic field. The total translational force on a molecule is the
force required to change its orbital momentum (mass times orbital velocity) plus the
force required to change the linear momentum of the field due to the presence of the
dipole. This suggests writing the classical equations of motion in the form
($\mathbf{v} = d\mathbf{r}/dt$ = orbital velocity)

$$ \frac{d}{dt} \left( M\mathbf{v} - \frac{1}{c}\, \boldsymbol{\mu} \times \mathbf{B} \right) = \nabla (\boldsymbol{\mu} \cdot \mathbf{E}) . \tag{7} $$

Except for nearly stationary molecules $\boldsymbol{\mu} \times \mathbf{B}/c$ is small compared to $M\mathbf{v}$.
Nevertheless the time rate of change of $\boldsymbol{\mu} \times \mathbf{B}/c$ is comparable to the time rate of change of
$M\mathbf{v}$ and both of these quantities are comparable to the gradient of $\boldsymbol{\mu} \cdot \mathbf{E}$.

* We are assuming that theories based on (4) or (6) can be elaborated in such a
way as to contain the conservation laws. Whether or not this can be accomplished
in a reasonable way requires further discussion.

The physical interpretation of the $\boldsymbol{\mu} \times \mathbf{B}$ term and the analogous term for a magnetic
dipole has been discussed briefly in S. Barone, Phys. Rev. D 12, 3363 (1975) and
S. Barone and M. Narcowich, Phys. Rev. D 11, 2880 (1975). These papers
contain references to the earlier literature on this subject. H. Haus (private
communication) has produced a macroscopic model which verifies (for permanent magnetic
dipoles) that the term analogous to $-\boldsymbol{\mu} \times \mathbf{B}/c$ is equal to the linear momentum in
the field. For details see P. Melman, Ph.D. Thesis, "Orbital Motion of Dipoles
in Electromagnetic Fields," Polytech. Inst. of New York (1976) Appendix A.

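In order to construct a quantized, Hamiltonian description it is convenient to
define

$$\boldsymbol{\pi} = \mathbf{p} + \frac{1}{c}\,\boldsymbol{\mu}\times\mathbf{B} , \tag{8}$$

where p is the canonical momentum conjugate to r, the orbital location of the molecule.
The Hamiltonian

$$H = \frac{\pi^2}{2M} + H_{\text{internal}} - \boldsymbol{\mu}\cdot\mathbf{E} \tag{9}$$

and the usual commutation rules yield the Heisenberg equations

$$\boldsymbol{\pi} = M\,\frac{d\mathbf{r}}{dt} , \tag{10}$$

$$\frac{d\boldsymbol{\pi}}{dt} = \boldsymbol{\mu}\cdot\nabla\mathbf{E} + \frac{1}{c}\,\frac{d\boldsymbol{\mu}}{dt}\times\mathbf{B} + \text{terms proportional to } \boldsymbol{\pi} . \tag{11}$$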


Thus, according to the Hamiltonian Eq. (9), π is to be identified as the orbital or kinetic
momentum. π is the molecular mass times its orbital velocity (Eq. (10)), and its time
rate of change is the classical force on an electric dipole (Eq. (11)). p is the canonical
or total momentum (orbital + field). The Hamiltonian consists of internal energy,
kinetic energy and dipole energy. The cross-terms (p · μ × B + μ × B · p)/2Mc, which
appear in Eq. (9) but not in Eq. (1), are the electric analog of spin-orbit terms in the
theory of electronic energy levels. The entire Hamiltonian Eq. (9) may be written

$$H = \frac{p^2}{2M} + H_{\text{internal}} - \boldsymbol{\mu}\cdot\mathbf{E} - \mathbf{m}\cdot\mathbf{B} + \text{terms proportional to } 1/c^2 , \tag{12}$$

where m ≡ μ × π/Mc is the magnetic moment to be associated with a moving electric
moment. In the present situation the effect of the m · B term on energy levels is
negligible. This term is necessary, however, in order to obtain Heisenberg equations
of the same form as the classical equations of motion.
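
The step from Eq. (9) to Eq. (12) can be checked by direct expansion. The derivation below is only a sketch: it assumes that Eq. (8) reads π = p + μ × B/c and that the kinetic term of Eq. (9) is π²/2M, and it ignores the ordering of the internal-coordinate operators.

```latex
\begin{align*}
\frac{\pi^2}{2M}
  &= \frac{1}{2M}\left(\mathbf{p} + \frac{1}{c}\,\boldsymbol{\mu}\times\mathbf{B}\right)^2 \\
  &= \frac{p^2}{2M}
   + \frac{1}{2Mc}\left[\mathbf{p}\cdot(\boldsymbol{\mu}\times\mathbf{B})
   + (\boldsymbol{\mu}\times\mathbf{B})\cdot\mathbf{p}\right]
   + \frac{(\boldsymbol{\mu}\times\mathbf{B})^2}{2Mc^2} .
\end{align*}
% Using a.(b x c) = -(b x a).c, the cross-terms combine, to first order in 1/c, into
\begin{equation*}
\frac{1}{Mc}\,\mathbf{p}\cdot(\boldsymbol{\mu}\times\mathbf{B})
  = -\,\frac{\boldsymbol{\mu}\times\mathbf{p}}{Mc}\cdot\mathbf{B}
  \approx -\,\mathbf{m}\cdot\mathbf{B} ,
\qquad
\mathbf{m} = \frac{\boldsymbol{\mu}\times\boldsymbol{\pi}}{Mc} .
\end{equation*}
```

Replacing p by π inside m changes the result only at order 1/c², which is why the remainder is lumped into higher-order terms.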

3)  The  theoretical  arguments  in  favor  of  Eq.  (9)  are  based  on  the  correspondence 
principle.  But  is  the  correspondence  principle  sufficiently  well  verified  in  this  arena 
that  it  can  be  applied  without  question?  One  can  also  question  the  extent  to  which  the 
quantum  theory  really  has  been  made  to  correspond  to  the  classical  theory.  The  force 
laws are the same. But there is no generally accepted canonical formalism for
classical systems with a dipole moment. The total energy in Eq. (9) consists of internal
energy, kinetic energy and electric dipole energy. However, on eliminating the velocity
in favor of the canonical momentum (as in Eq. (12)) a magnetic dipole energy appears.
According to Eq. (9) the magnetic dipole energy is really part of the kinetic energy.
This might be interpreted as contrary to classical electrodynamics.* If the force law is
accepted as primary, the energy appears to be given by Eq. (9). If the energy
expression Eq. (1) is accepted as primary, the force law is non-classical.

It might be argued that a non-classical force law for molecular dipole moments
is really not in conflict with the correspondence principle because in the classical
limit (ħ → 0) dipole moments vanish. Molecular dipole moments are quantum mechanical
entities and so may be subject to a non-classical force law. Alternatively, quantum
effects at the molecular level (proportional to ħ) may contribute to the effective
force in such a way as to give it a non-classical form.

*There is, however, an analogy to the appearance of the magnetic moment of a Dirac
electron as a consequence of Zitterbewegung of its charge.

Should Eq. (9) turn out to be correct, Eq. (8) carries the implication that
perpendicular velocity components do not commute and hence cannot be simultaneously
determined with unlimited precision. To within terms of order 1/c²,

$$\boldsymbol{\pi}\times\boldsymbol{\pi} = \frac{i\hbar}{c}\,\boldsymbol{\mu}\cdot\nabla\mathbf{B} . \tag{13}$$


It  is  well  known  that  perpendicular  velocity  components  of  charged  systems  do  not 
commute.  According  to  Eq.  (9)  there  is  an  analogous  situation  for  neutral  systems 
with  a dipole  moment.  On  the  other  hand  according  to  Eq.  (1),  velocity  components 
commute  with  each  other. 
For a bound state of two oppositely charged, spinless particles, the Heisenberg
equations for the individual particles are

$$m_1\frac{d^2\mathbf{r}_1}{dt^2} = -\nabla_1\varphi(|\mathbf{r}_1-\mathbf{r}_2|) + q\left[\mathbf{E}(\mathbf{r}_1,t) + \frac{\dot{\mathbf{r}}_1}{c}\times\mathbf{B}(\mathbf{r}_1,t)\right]$$

$$m_2\frac{d^2\mathbf{r}_2}{dt^2} = -\nabla_2\varphi(|\mathbf{r}_1-\mathbf{r}_2|) - q\left[\mathbf{E}(\mathbf{r}_2,t) + \frac{\dot{\mathbf{r}}_2}{c}\times\mathbf{B}(\mathbf{r}_2,t)\right]$$

where φ is the static interparticle potential and E, B describe an externally imposed
classical radiation field. The time derivative of the total mechanical momentum is

$$\frac{d}{dt}\left(m_1\dot{\mathbf{r}}_1 + m_2\dot{\mathbf{r}}_2\right) = q\left[\mathbf{E}(\mathbf{r}_1,t) - \mathbf{E}(\mathbf{r}_2,t)\right] + \frac{q}{c}\left[\dot{\mathbf{r}}_1\times\mathbf{B}(\mathbf{r}_1,t) - \dot{\mathbf{r}}_2\times\mathbf{B}(\mathbf{r}_2,t)\right]$$

so that in the dipole limit

$$\frac{d}{dt}\left(M\frac{d\mathbf{R}}{dt}\right) = \boldsymbol{\mu}\cdot\nabla\mathbf{E}(\mathbf{R},t) + \frac{1}{c}\,\dot{\boldsymbol{\mu}}\times\mathbf{B}(\mathbf{R},t)$$

where M = m₁ + m₂, R = (m₁r₁ + m₂r₂)/M is the usual center of mass coordinate
and μ = q(r₁ - r₂) is the electric dipole moment of the system. Thus for this model
of a quantum system the time derivative of the total mechanical momentum of the
system is given by the classical force law of Eq. (4). This result is independent of
the number of particles involved in the bound state. The above equation, however,
must be solved subject to the constraint [R, dR/dt] = iħ/M. This can introduce
quantum effects proportional to ħ into the time evolution of R.
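
The passage to the dipole limit can be made explicit with a first-order Taylor expansion about the center of mass. The lines below are a sketch under two assumptions that are not spelled out in the text: the separation d = r₁ - r₂ is small compared with the scale over which the fields vary, and terms containing gradients of B are dropped.

```latex
\begin{align*}
\mathbf{r}_1 &= \mathbf{R} + \frac{m_2}{M}\,\mathbf{d}, \qquad
\mathbf{r}_2 = \mathbf{R} - \frac{m_1}{M}\,\mathbf{d}, \qquad
\mathbf{d} = \mathbf{r}_1 - \mathbf{r}_2 , \\
q\left[\mathbf{E}(\mathbf{r}_1,t) - \mathbf{E}(\mathbf{r}_2,t)\right]
  &\approx q\,(\mathbf{d}\cdot\nabla)\,\mathbf{E}(\mathbf{R},t)
   = (\boldsymbol{\mu}\cdot\nabla)\,\mathbf{E}(\mathbf{R},t) , \\
\frac{q}{c}\left[\dot{\mathbf{r}}_1\times\mathbf{B}(\mathbf{r}_1,t)
   - \dot{\mathbf{r}}_2\times\mathbf{B}(\mathbf{r}_2,t)\right]
  &\approx \frac{q}{c}\,\dot{\mathbf{d}}\times\mathbf{B}(\mathbf{R},t)
   = \frac{1}{c}\,\dot{\boldsymbol{\mu}}\times\mathbf{B}(\mathbf{R},t) .
\end{align*}
```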
A SOLUTION OF THE X-RAY "PHASE PROBLEM"

B.  Post 

The direct determination of the phases of x-ray reflections from single crystal
intensity data constitutes one of the oldest and most challenging problems of x-ray
physics. It is of particular interest to crystallographers; its solution could make
possible the determination of the crystal structures of all substances from which
suitable single crystal data can be obtained.

It has been suspected for some time that coherent interactions among diffracted
beams, which take place when several sets of planes diffract simultaneously, may
provide clues to the solution of the "phase problem." Lipscomb, Fankuchen, Eckstein,
Miyake and Kambe, Hart and Lang, and others have investigated the problem from
that point of view, with only limited success.

In  general,  the  phases  of  individual  reflections  vary  with  choice  of  the  unit  cell 
origin.  The  phases  of  the  products  of  groups  of  structure  factors,  the  sums  of  whose 
indices  equal  zero,  are  however,  invariant  to  choice  of  origin,  and  their  determination 
has  physical  significance.  Our  discussion  is  limited  to  such  phase  products. 

A procedure for generating n-beam simultaneous diffraction is illustrated
schematically in Figure 1. When reciprocal lattice point (rlp) H is brought to its
diffracting position on the surface of the Ewald sphere, conventional two-beam
diffraction takes place, and diffracted beams are directed to O and H. When the
crystal is rotated about OH without disturbing the setting of H, additional rlp's are
brought to their diffracting positions and n-beam diffraction (n > 2) occurs. Figure 2
shows that under such conditions parallel, overlapping, coherent beams are
simultaneously directed to the rlp's in diffracting positions. It is evident that
simultaneous n-beam diffraction can provide the necessary conditions for interference
among discrete coherent beams in simple, controllable form.

Experimental intensities along two-beam diffraction lines, in the immediate
vicinity of three-beam diffraction, are shown in Figs. 3(a) and (b) for triplet phase
products of +1 and -1, respectively. The techniques used to record the photographs have
been described previously. The specimen investigated was a single crystal of
aluminum oxide (corundum). Its structure factor phases are well known. The
indices, phases, and magnitudes of the structure factors involved in the three-beam
interactions are listed in the Figure captions.

The distinctive feature of Fig. 3 is that in (a) the intensity is essentially
symmetric, and in (b) it is asymmetric about the three-beam direction. We show below
Fig. 1. Schematic representation of three-beam simultaneous
diffraction involving rlp's O, H, and P.


Fig. 2. Discrete diffracted beams directed to rlp H, within the crystal.

that this is a general characteristic of the intensity distributions for the two opposite
phase products, provided that all three structure factors are non-zero. The observed
symmetry differences are implicit in the dynamical theory of diffraction. They make
available a tool for the experimental determination of structure factor phase products.
Fig. 3. Intensity distributions along two-beam CuKα₁ and Kα₂ lines
near three-beam points in Al₂O₃, for (a) positive and
(b) negative phase products. Panel (b): 012, -47; 104, -81.

The x-ray wave-fields in n-beam diffraction obey Maxwell's equations for a
medium with a complex, periodic dielectric constant, under conditions which satisfy
Bragg's Law.

The amplitudes are solutions of the linear homogeneous equations:

$$2\varepsilon_H\,\mathbf{D}_H = \sum_P \phi_{H-P}\,\mathbf{D}_{P[H]} , \tag{1}$$

where ε_H, defined by

$$\varepsilon_H = \frac{1/\lambda_{\text{crystal}}}{1/\lambda_{\text{vac}}} - 1 ,$$

is an unknown to be chosen such that Eq. (1) has a non-trivial solution. φ_(H-P) is a
coefficient in the expansion of the electric susceptibility in a Fourier series. It is
proportional to the negative of the structure factor. D_P[H] is the vector component
of D_P perpendicular to K_H. The summation is over all rlp's but is limited, in practice,
to terms for which the ε's are very small, i.e., to the rlp's very near their
diffracting positions.


For a three-beam case, Eq. (1) yields three closely similar pairs of solutions.
Without significant loss of generality, we will limit our discussion to one member
of each such pair. We will discuss centrosymmetric crystals whose origins are at
symmetry centers. In such crystals, φ_H = φ_(-H) and the phase angles of the structure
factors are either 0 or π. It is readily shown that the conclusions reached for three
beams apply, with only minor modifications, to four-beam and higher order cases.

The allowed values of ε are solutions of the secular determinant of Equation (1).
They are strongly phase dependent, as can be seen from an examination of
the expansion of the determinant at the exact three-beam point:

$$(\phi_0 - 2\varepsilon)^3 - (\phi_0 - 2\varepsilon)\left(\phi_H^2 + \phi_P^2 + \phi_{H-P}^2\right) + 2\,\phi_H\,\phi_P\,\phi_{H-P} = 0 \tag{2}$$

Equation (2) has three unequal real roots summing to zero. The distribution of the signs
of the roots is determined by the sign of the last term, i.e., by the sign of the
invariant phase of the structure factor triplet. The two possible distributions of the
roots (- + + or - - +) lead to propagating modes which differ significantly with respect
to excitation and absorption.
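
The dependence of the root signs on the triplet product can be illustrated numerically. The sketch below assumes that Eq. (2), written in the variable x = φ₀ - 2ε, has the depressed-cubic form x³ - (φ_H² + φ_P² + φ_(H-P)²)x + 2φ_Hφ_Pφ_(H-P) = 0 for real coefficients; the function names and the sample coefficient values are illustrative and are not taken from the report.

```python
import math


def triplet_roots(phi_h, phi_p, phi_hp):
    """Return the three real roots x of x**3 - S*x + 2*P = 0, sorted ascending.

    S = phi_h**2 + phi_p**2 + phi_hp**2 and P = phi_h*phi_p*phi_hp.
    Assumes real coefficients that are not all zero (centrosymmetric case).
    Uses the trigonometric solution of a depressed cubic with three real roots.
    """
    s = phi_h ** 2 + phi_p ** 2 + phi_hp ** 2
    prod = phi_h * phi_p * phi_hp
    p, q = -s, 2.0 * prod                      # cubic is x**3 + p*x + q = 0
    radius = 2.0 * math.sqrt(-p / 3.0)
    arg = (3.0 * q / (2.0 * p)) * math.sqrt(-3.0 / p)
    arg = max(-1.0, min(1.0, arg))             # guard against rounding error
    theta = math.acos(arg) / 3.0
    return sorted(radius * math.cos(theta - 2.0 * math.pi * k / 3.0)
                  for k in range(3))


def root_signs(phi_h, phi_p, phi_hp):
    """Return the signs of the sorted roots as a tuple of '-' and '+'."""
    return tuple("-" if x < 0.0 else "+" for x in triplet_roots(phi_h, phi_p, phi_hp))


if __name__ == "__main__":
    # Hypothetical coefficient magnitudes; only the sign of the product differs.
    print(root_signs(1.0, 0.8, 0.6))    # positive triplet product
    print(root_signs(1.0, -0.8, 0.6))   # negative triplet product
```

In this form the roots always sum to zero because the cubic has no quadratic term, and reversing the sign of the product reverses the sign of every root, which exchanges the two distributions.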


The general solution for the ε's near a three-beam point is usually represented
by surfaces in reciprocal space defining the wave vector sets (K_O, K_H, K_P)
for each mode (Figure 4). The collection of surfaces representing all the modes is
referred to as the "dispersion surface."


Fig.  4.  Section  through  dispersion  surface  near  three-beam  point. 
Calculated  for  positive  and  negative  phase  products. 


A calculated section through the dispersion surface, parallel to a two-beam
line passing through a three-beam point near the center, is shown in Fig. 4, for both
signs of the triplet phase. The points La (Laue) and Lo (Lorentz) are the three-beam
diffraction points for reciprocal distances of 1/λ_vac and 1/λ_crystal, respectively,
from each of the three rlp's.

In general, the modes whose surfaces are closest to the Laue point have the
lowest absorption coefficients and are responsible for most of the transmitted
intensities. The corresponding modes in Fig. 4 are 1, 2 on the left and 3, 4 on the
right. We note that when the triplet phase is positive (dashed lines), the curves
corresponding to those modes are approximately symmetrical about the three-beam
region (near the center). For negative phase, however, the curves (solid lines) show
a large discontinuity just to the right of the three-beam region, and the 3, 4 curve on
the right reaches positions equivalent to those of 1, 2 on the left only at large angular
distances from the center.

The calculated effects of the above on the excitation of modes of propagation
and on the absorption coefficients are shown in Figures 5 and 6. The effects of
polarization of the diffracted beams have been taken into account in these calculations.
To reduce confusion we show the two-mode averages for only two pairs of modes
(1, 2 and 3, 4) in the Figures. For the positive phase product the excitations and the
absorption coefficients undergo only minor changes in passing through the three-beam
regions. The corresponding changes are much greater for the negative phase
product.


Fig. 5. Absorption coefficients of modes of propagation.
Calculated for CuKα₁, for negative and positive phase products.


Fig. 6. Relative excitations of modes of propagation. Calculated
for negative and positive phase products.

The intensity of a diffracted beam is given by:

$$I_H = \left|\sum_m D_H(m)\,\exp\left[-2\pi i\,\mathbf{K}_r(m)\cdot\mathbf{r}\right]\exp\left[-2\pi\,\mathbf{K}_i(m)\cdot\mathbf{r}\right]\right|^2 \tag{3}$$

i.e., by the product of terms representing excitation and absorption. The summation
is over the 'm' modes. K_r and K_i are the real and imaginary parts of the propagation
vector; 4πK_i(m) equals μ_m, the linear absorption coefficient of mode 'm'.

It is evident from Eq. (3) and Figs. 5 and 6 that positive and negative structure factor
phase triplets should yield different spatial distributions of diffracted intensity in
n-beam diffraction. We have observed such phase effects repeatedly in perfect crystals
of germanium and ammonium dihydrogen phosphate as well as in the relatively
imperfect crystal of aluminum oxide discussed above. Analysis of Eq. (2) shows that
the effects of a change of the phase product on the diffraction process are maximized
when all three structure factors are equal to one another, and vanish if one of the
structure factors equals zero. The phase effects should therefore be detected as
readily when all three structure factors are "weak" as when all are "strong."

The extent to which these effects can be detected in imperfect crystals, such as are
usually used for crystal structure analysis, or in non-centrosymmetric crystals,
remains to be determined.

Joint  Service  Technical  Advisory  Committee 
STRAIN DEPENDENCE OF RUST LAYERS AT X-BAND
H. L. Bertoni and L. M. Silber

A. Introduction

Radar returns from moving vehicles, or stationary vehicles with engine running,
are found to exhibit modulation, some of which is periodic and some random in nature.
Possible sources of this modulation are reflections from moving parts, such as an
engine fan; modulation of the radar-induced current distribution as a result of
intermittent contact between adjacent metal parts; and vibration-induced changes in the
surface conductivity of rusty metal parts. The nature of radar modulation by agitated
metals (RADAM) is of interest as a means of detecting and identifying vehicles, and in
order to avoid such detection and identification. Previous investigations have
concentrated on the effects of reflections from moving parts, and on the effects of
intermittent contacts.

The work described here has focused on the importance for the RADAM effect of
vibrational strain-induced changes in conductivity of rusty surfaces. To examine the
effect under controlled conditions, we carried out laboratory measurements at X-band
in a waveguide system. In order to enhance the sensitivity of the measurements, a
reflection cavity was used, whose end wall was replaced by rusty iron foils. The
return loss of the cavity was measured for the foils in both strained and unstrained states.
From these measurements the effective surface resistance, the plane wave reflection
coefficient, and their dependence on strain were found.

B.  Cavity  Measurements 

Two  techniques  were  developed  to  measure  the  changes  in  cavity  absorption  due 
to  straining  the  rusty  foil  end-wall  of  X-band  reflection  cavities.  The  first  technique 
was  developed  to  measure  the  extremely  small  changes  in  cavity  Q obtained  with  the 
lightly  rusted  foils  that  were  initially  prepared.  The  foils  were  strained  by  clamping 
the  edges  of  the  foil  to  the  choke  flange  at  the  back  end  of  the  cavity  and  driving  the 
center  of  the  foil  in  and  out  via  an  attached  drive  pin  --  see  Figure  1(b).  With  this 
method  of  straining  the  foil,  the  effective  length,  and  hence  resonant  frequency,  of  the 
cavity  are  modulated,  as  well  as  the  conductivity  of  the  rust  film. 

The first measurement technique used this resonant frequency modulation. The
drive pin was displaced periodically at a frequency f_m and the modulation of the
reflected power at frequencies f_m and 2f_m was measured. The fixed microwave frequency
was first set at the resonant frequency f_0 of the unstrained cavity, and then at the
half-power frequency f_1/2. The ratio of the modulation at 2f_m for microwave frequency
f_0 to the modulation at f_m for microwave frequency f_1/2 was determined and compared
with a theory developed assuming no change in conductivity. For copper and clean iron
foils the theory and experiment were in good agreement. However, a 30% discrepancy was
found between theory and experiment for lightly rusted iron foils, clearly indicating a
change in the conductivity of the rust layer. Subsequently, more heavily rusted foils
were prepared, for which it was possible to measure the return loss directly, and the
modulation technique was not pursued.


Fig. 1. Set-up for measuring electrical conductivity
of rusty iron foils: (a) waveguide circuit;
(b) cavity with foil end-wall.

Direct return loss measurements were made using a rectangular electroformed
copper cavity. The cavity was approximately one wavelength long at resonance and
coupling was via a circular iris, whose diameter was chosen so that the cavity was
under-coupled. The measurement circuit is shown in Fig. 1(a) and is driven by a
swept klystron oscillator. By keeping the incident power constant and measuring the
power reflected from the cavity via a precision attenuator, the reflection coefficient
could be measured with a reproducibility of 0.1 dB.

The cavity itself is shown in cross-section in Fig. 1(b). A 2.5 mil thick
Teflon sheet was used between the choke flange and foil to eliminate changes in contact
resistance as a result of tightening the clamping ring or stressing the foil. A
micrometer drive was connected to the drive pin, which was in turn attached to the foil.
Iron foils were rusted by placing 2 mil thick stock in an oven at 70°C, together with
a dish of water to provide vapor. X-ray analysis indicates that the rust layer is
composed of a mixture of Fe₂O₃, Fe₃O₄, FeO and FeO(OH). A crude estimate of the
thickness t of the rust layer can be obtained by subtracting the initial foil thickness from
that of the rusted foil, and dividing by two. For the foil described below, t = 1.2 mil.

Starting from the zero position, the drive pin was moved in a distance of 0.5 mm,
then withdrawn to the zero position, and then pulled out a distance of 0.5 mm. The
measured resonant frequency of the cavity and return loss at resonance are plotted in
Fig. 2 as a function of pin displacement for one of the foils tested. Hysteresis effects
are observed in these curves, which may be due to slippage of the foil in the clamping
ring.


Fig. 2. Measured values of resonant frequency and return
loss vs. pin displacement.

While  the  return  loss  curve  is  more  irregular  than  the  resonant  frequency  curve, 
there  is  a systematic  change  with  strain.  Such  a change  was  not  observed  for  a pure 
iron  foil,  and  is  therefore  attributed  to  the  presence  of  the  rust.  Because  the  cavity  is 
undercoupled,  increase  in  return  loss  indicates  a decrease  in  the  loss  in  the  foil. 
C. Evaluation of Surface Resistance and Plane Wave Reflection Coefficient

In order to determine the surface resistance of the foil and its plane wave
reflection coefficient from the return loss of the cavity, the equivalent circuit of Fig. 3 is
used to describe the cavity.


Fig. 3. Equivalent circuit for an iris-coupled cavity
with end-wall surface impedance Z_L.

The shunt susceptance -jB with B > 0 represents the
coupling iris of the cavity. The transmission line has wavenumber κ, which must be
taken complex in the form

$$\kappa = \beta - j\alpha \tag{1}$$

to account for wall loss. The admittance of the line for the dominant mode of the
waveguide is

$$Y = \frac{\kappa}{\omega\mu_0} = Y_0\left(1 - j\frac{\alpha}{\beta}\right) , \tag{2}$$

where

$$Y_0 = \frac{\beta}{\omega\mu_0} , \tag{3}$$

μ₀ is the permeability of free space, and ω is the radian frequency. Because the area
of the wall containing the coupling iris is only 1/10 that of the side walls, for simplicity
the loss in the iris is neglected.

For a conducting end wall |Z'_L| << 1, while near resonance tan κl ≈ (κl - nπ),
where n is the number of half-wavelengths nearest to the cavity length. With these
approximations, the input admittance to the cavity, normalized to Y₀, is

$$Y'_{in} = -jB' + \frac{(R'_L + \alpha l) - j(X'_L + \beta l - n\pi)}{(R'_L + \alpha l)^2 + (X'_L + \beta l - n\pi)^2} , \tag{4}$$
where

$$B' = B/Y_0 , \qquad R'_L + jX'_L = Y_0 Z_L . \tag{5}$$

At the resonant frequency, the input admittance Y'_in is a real conductance G'_in
given by

$$G'_{in} = \frac{2(B')^2\,(R'_L + \alpha l)}{1 + \sqrt{1 - \left[2B'(R'_L + \alpha l)\right]^2}} . \tag{6}$$

Solving Eq. (6) for R'_L gives

$$R'_L = \frac{G'_{in}}{(B')^2 + (G'_{in})^2} - \alpha l . \tag{7}$$

The conductance G'_in can be found from the return loss ρ, i.e., the magnitude of the
reflection coefficient from the cavity load, via the expression

$$G'_{in} = \frac{1 + \rho}{1 - \rho} , \tag{8}$$

appropriate for the undercoupled cavity used. Thus, measuring ρ, we can compute
R'_L knowing the cavity parameters αl and B'. The parameters were determined to be
αl = 6.12 × 10⁻⁴ and B' = 43.5.


Using the method outlined above, the normalized surface resistance R'_L and the
actual surface resistance R_L were computed for values of return loss ρ nominally
corresponding to the zero point and two end points of the pin travel in Figure 2. These
values are listed in Table I, together with the normalized input conductance G'_in from
Equation (8). The remaining data in the table are discussed below. It is seen that the
surface resistance R_L changes by a factor of 1.6.
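
The chain from measured return loss to normalized surface resistance can be sketched in a few lines. The code below assumes that Eq. (8) reads G'_in = (1 + ρ)/(1 - ρ), that Eq. (7) reads R'_L = G'_in/((B')² + (G'_in)²) - αl, and that Eq. (6) carries a plus sign in front of the square root; it also assumes that the cavity parameters are αl = 6.12 × 10⁻⁴ and B' = 43.5 and that the tabulated return loss in dB converts to ρ as 10^(-dB/20). The function names are illustrative only.

```python
import math

ALPHA_L = 6.12e-4   # wall-loss parameter alpha*l (assumed reading of the text)
B_PRIME = 43.5      # normalized iris susceptance B' (from the text)


def reflection_magnitude(return_loss_db):
    """Convert a return loss in dB to the reflection-coefficient magnitude rho."""
    return 10.0 ** (-return_loss_db / 20.0)


def input_conductance(rho):
    """Normalized input conductance at resonance, assumed form of Eq. (8)."""
    return (1.0 + rho) / (1.0 - rho)


def normalized_surface_resistance(g_in, b_prime=B_PRIME, alpha_l=ALPHA_L):
    """Normalized end-wall resistance R'_L, assumed form of Eq. (7)."""
    return g_in / (b_prime ** 2 + g_in ** 2) - alpha_l


def conductance_from_resistance(r_l, b_prime=B_PRIME, alpha_l=ALPHA_L):
    """Forward relation, assumed form of Eq. (6), used as a consistency check."""
    a = r_l + alpha_l
    return 2.0 * b_prime ** 2 * a / (1.0 + math.sqrt(1.0 - (2.0 * b_prime * a) ** 2))


if __name__ == "__main__":
    for label, loss_db in (("in", 7.0), ("0", 6.2), ("out", 5.0)):
        g = input_conductance(reflection_magnitude(loss_db))
        r = normalized_surface_resistance(g)
        print(f"{label:>3}: G'_in = {g:.2f}, R'_L x 1e3 = {1e3 * r:.2f}")
```

Small differences in the last digit relative to Table I are to be expected, since the tabulated values appear to have been computed from rounded intermediate results.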

In order to correlate the change in surface resistance with a change in the
conductivity σ of the rust it is necessary to postulate an electrical model for a rust layer
over iron. If we were to assume the rust to be very thick compared to the skin depth
for it, then the surface resistance would be given by R_L = √(ωμ₀/2σ). Since R_L
decreases as the foil is flexed into the cavity, the thick rust model would imply an
increase in the conductivity σ, which is contrary to the expectation that positive strain
causes the rust grains to separate slightly, thereby decreasing conductivity.
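
The inference drawn from the thick-layer model can be made concrete by inverting the surface-resistance formula. The sketch below assumes R_L = √(ωμ₀/2σ), a frequency of 10 GHz, and that the R_L entries of Table I (0.412 and 0.674) are in ohms; these are assumptions for illustration, and the function name is hypothetical.

```python
import math

MU_0 = 4.0e-7 * math.pi  # permeability of free space, H/m


def thick_layer_sigma(surface_resistance_ohm, frequency_hz=10.0e9):
    """Conductivity (mhos/m) implied by R = sqrt(omega*mu0/(2*sigma))."""
    omega = 2.0 * math.pi * frequency_hz
    return omega * MU_0 / (2.0 * surface_resistance_ohm ** 2)


if __name__ == "__main__":
    sigma_in = thick_layer_sigma(0.412)    # foil flexed into the cavity
    sigma_out = thick_layer_sigma(0.674)   # foil pulled out of the cavity
    print(f"sigma (in)  = {sigma_in:.3g} mhos/m")
    print(f"sigma (out) = {sigma_out:.3g} mhos/m")
```

Because σ varies as 1/R², the smaller resistance measured with the foil flexed inward maps to the larger conductivity, which is the trend the text regards as physically unexpected.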

An  alternate  model  that  was  considered  represents  the  rust  as  a homogeneous 
TABLE I. Measured and computed parameters
for rusty iron foil.

Pin Position      in        0         out
ρ (dB)            7         6.2       5
G'_in             2.62      2.92      3.56
R'_L (×10³)       0.77      0.93      1.26
R_L               0.412     0.497     0.674
σ (×10⁻³)         5.9       7.5       11.5
|Γ|               .9978     .9973     .9964

layer of conductivity σ above the iron substrate. The surface impedance R_L + jX_L and
plane wave reflection coefficient were computed as a function of layer conductivity and
layer thickness. In Fig. 4 we have plotted the surface resistance R_L as a function of σ
for different values of layer thickness t, assuming the frequency to be 10 GHz. It is
seen that for finite t, R_L increases with conductivity to a maximum and then becomes
asymptotic to the curve for an infinitely thick rust layer. The maximum occurs at a
value of σ nearly equal to that for which the thickness t is equal to the skin depth
δ = √(2/ωμ₀σ).


Fig. 4. Surface resistance R_L vs. layer conductivity for
rust layers of thickness t over an iron substrate.

Since single crystal Fe₃O₄ has conductivity 10⁴ mhos/m, while Fe₂O₃ and FeO
have much smaller conductivity, the effective conductivity of the rust should be less
than 10⁴ mhos/m. Thus, for rust layers whose thickness is in the range of 1 to 2 mil
the surface resistance R_L increases with conductivity.
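
The conductivity at which the layer thickness equals the skin depth can be estimated directly. The sketch below assumes the skin-depth expression δ = √(2/ωμ₀σ), a frequency of 10 GHz, and a non-magnetic rust layer; the function names are illustrative and the results are order-of-magnitude guides only.

```python
import math

MU_0 = 4.0e-7 * math.pi   # permeability of free space, H/m
MIL = 25.4e-6             # one mil (0.001 inch) in meters


def skin_depth(sigma, frequency_hz=10.0e9):
    """Skin depth in meters for conductivity sigma (mhos/m)."""
    omega = 2.0 * math.pi * frequency_hz
    return math.sqrt(2.0 / (omega * MU_0 * sigma))


def crossover_sigma(thickness_m, frequency_hz=10.0e9):
    """Conductivity (mhos/m) at which the skin depth equals the layer thickness."""
    omega = 2.0 * math.pi * frequency_hz
    return 2.0 / (omega * MU_0 * thickness_m ** 2)


if __name__ == "__main__":
    for mils in (1.0, 1.2, 2.0):
        sigma = crossover_sigma(mils * MIL)
        print(f"t = {mils} mil: skin depth equals t at sigma = {sigma:.3g} mhos/m")
```

Below the crossover conductivity the layer is thinner than a skin depth, which is the regime in which the surface resistance rises with conductivity.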

The values of σ listed in Table I were taken from the curves of Fig. 4 assuming
the thickness of the rust layer to be t = 1.2 mil. Because the estimate of layer
thickness is very crude, the values of conductivity for each foil may be off by a factor that
differs significantly from unity. However, the fractional change in conductivity is not
affected by errors in layer thickness. It is seen that σ changes by a factor of almost
two as the foil is moved from the "in" position to the "out" position.

The modulation in σ causes a change in the plane wave reflection coefficient Γ.
The dependence of Γ on σ is depicted in Fig. 5, where |Γ| and arg Γ are plotted as a
function of σ for various thicknesses of rust layers. Using these curves, the values of
|Γ| have been listed in Table I corresponding to the various values of σ. It is seen
that the strain in the rust film induces a change Δ|Γ| in the magnitude of Γ that is
1.5 × 10⁻³.

D. Conclusion

It is found that when a polycrystalline rust layer is strained, its conductivity at
X-band frequencies can undergo large changes. Strain changes from +10⁻⁴ to -10⁻⁴ on
either side of the unstrained state lead to changes in conductivity of almost a factor of
two. The change in conductivity leads to a change in the plane wave reflectivity. Thus,
since |Γ| ≈ 1, the X-band radar return from a rusty plate vibrating at frequency f_m
with induced strain amplitude of 10⁻⁴ would show an amplitude modulation

$$E_0\left[1 + \Delta|\Gamma|\,\cos(2\pi f_m t)\right]$$

where E₀ is the amplitude of the field measured for a perfectly conducting plate and
Δ|Γ| = 1.5 × 10⁻³, or -58 dB.
Fig. 5. Plane-wave reflection coefficient vs. conductivity for
rust layers of thickness t over an iron substrate:
|Γ| and arg Γ.
NEW SOLID STATE MATERIALS

E. Banks, C. Bush, M. Greenblatt, B. R. McGarvey, S. Nakajima and M. Shone


Research in the area of new materials is devoted to synthesis, crystal growth
and characterization of materials having interesting electrical, optical and magnetic
properties which may provide a basis for new optical, electro-optic and magneto-optic
devices. Section A describes the continuation of studies of complex transition
metal fluorides. Section B describes the discovery, synthesis and fluorescence of
new mixed fluorides of magnesium and the divalent rare earths Eu and Sm, EuMgF₄
and SmMgF₄. Section C describes the synthesis of some solid solutions based on
ZrP₂O₇, in which Li ions are introduced into large holes in the structure, and
discusses the possibility that these materials may be solid electrolytes for possible
advanced battery applications. In Section D, we briefly summarize work on the
CdF₂:Yb³⁺, Er³⁺ infrared-visible conversion phosphor, designed to detect the
preferential occurrence of closely-spaced Er³⁺-Yb³⁺ centers to explain the high
up-conversion efficiency.

A.  Complex  Transition  Metal  Fluorides 

Research during the past year on ternary fluorides of the general type
K_x M(II)_x M(III)_(1-x) F₃ has been confined to the systems where M(II) = Fe²⁺, Mn²⁺
and M(III) = Fe³⁺, in the composition range (0.4 < x < 0.6) where the crystals have the
"tetragonal bronze" structure, known to be ferroelectric in some oxide systems.
Bridgman pulling of melts yielded boules containing crystals large enough for X-ray
analysis. At
a nominal  composition  of  Kq  5^®0  5^3’  obtained  a single  crystal  suitable 

for  a structure  analysis.  Precession  photographs  showed  the  unit  cell  to  have  its  "a" 
axis  equal  that  of  the  tetragonal  bronze  cell,  and  the  "c"  axis  was  doubled.  Previous- 
ly, we  had  found  an  orthorhombic  superstructure  in  Kq  ^FeF^  with  a'  = 2 JTa  and 
c'  = 2c.  The  magnetic  unit  cell  found  from  powder  neutron  diffraction  also  showed  a 
doubled  "c"  axis,  but  no  evidence  of  a superstructure  was  seen  in  the  "a"  direction. 
Single  crystal  intensity  data  have  been  collected  on  Kq  5^®0  5^3  ^ structure 

refinement  is  now  in  progress.  When  the  positional  and  thermal  parameters  of  the 
atoms  are  accurately  known,  we  should  be  in  a better  position  to  interpret  the  neutron 
diffraction  data,  and  decide  whether  single  crystal  neutron  data  will  be  required. 

B. Fluorescence in New Divalent Rare Earth-Magnesium Fluorides

Several years ago, many papers appeared concerning new ternary fluorides of
formula type BaMF₄, where M = Mn²⁺, Fe²⁺, Co²⁺, etc. These materials have an
orthorhombic crystal structure, are ferroelectric at and below room temperature,
and show antiferromagnetic ordering below room temperature. At that time,
we thought it would be of interest to prepare analogous compounds with Eu²⁺ in place
of Ba. An attempt to prepare EuFeF₄ was made, but we discovered that EuF₂ reduces
FeF₂ to iron metal, forming EuF₃. Recently, reports appeared on the nonlinear
optic behavior of BaMgF₄ and BaZnF₄, with the same crystal structure as
BaMnF₄, etc. Since magnesium is reduced to the metal with much more difficulty
than the transition metals, we decided to attempt the preparation of EuMgF₄. Initial
attempts in Pt tubes led to the appearance of a few new weak X-ray lines and a blue
fluorescence under U.V. excitation, characteristic of Eu²⁺. The platinum was
blackened, indicating alloying with Mg. This led us to attempt the use of carbon
crucibles. Mixtures of EuF₂ and MgF₂ heated at temperatures below 900°C showed very
little sign of reaction. Heating to 1100°C produced melting, and the melt showed only
starting materials. This suggested an incongruently melting compound and very slow
solid state kinetics. It was then decided to attempt the reaction via Mg vapor. We
mixed stoichiometric amounts of EuF₃, MgF₂ and Mg metal and heated the mixture in a
sealed graphite crucible at 800-900°C for several hours. The product showed spotty
sites with blue fluorescence. Thus far, the most successful preparations were made
by adding Mg metal to the top of a mixture of EuF₃ and MgF₂. The product after
heating was a single phase with a new X-ray pattern whose major lines can be indexed
on an orthorhombic cell. This cell is comparable to those of the barium compounds,
with smaller dimensions. It has a brilliant blue fluorescence under ultraviolet
excitation. Such high brightness at room temperature in a stoichiometric compound is
quite unusual, as interaction among the activator ions usually quenches luminescence.
We have also prepared SrMgF₄, whose unit cell is almost identical with that of
EuMgF₄, as expected from the similarity in ionic radii of Eu²⁺ and Sr²⁺. This also
appears to be incongruently melting, as a melted sample yielded the patterns of the
starting materials. The other successful preparation using the magnesium vapor
method was SmMgF₄, with a unit cell similar to that of EuMgF₄. The unit cells are
compared to that of BaMgF₄ below:

              a (Å)     b (Å)     c (Å)

    BaMgF₄    5.81      14.509    4.125
    SmMgF₄    5.88      14.48     4.19
    EuMgF₄    5.82      14.39     4.14
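As a rough aid to reading the table above, the three cells can be compared through their volumes. The sketch below is an illustration only and is not part of the reported work: it assumes all three cells are orthorhombic (so that V = abc) and uses the tabulated parameters as given; the names `CELLS` and `orthorhombic_volume` are hypothetical.

```python
# Lattice parameters in angstroms, copied from the table above.
CELLS = {
    "BaMgF4": (5.81, 14.509, 4.125),
    "SmMgF4": (5.88, 14.48, 4.19),
    "EuMgF4": (5.82, 14.39, 4.14),
}


def orthorhombic_volume(a, b, c):
    """Volume of an orthorhombic cell; all angles are 90 degrees, so V = a*b*c."""
    return a * b * c


volumes = {name: orthorhombic_volume(*abc) for name, abc in CELLS.items()}

for name, volume in volumes.items():
    print(f"{name}: V = {volume:.1f} cubic angstroms")
```

On these numbers the Eu cell is smaller than the Sm cell, while the Ba cell falls between them rather than being the largest.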

The Sm and Eu compound cell parameters decrease as expected for the lanthanide
contraction. The fact that some of the Ba compound parameters are unexpectedly
smaller is not understood; it may indicate that the compounds are not actually
isostructural. We are now preparing the solid solution samples Sr₁₋ₓEuₓMgF₄ for a
study of the fluorescence and excitation spectra as a function of composition. The
bright blue emission of these materials suggests that they may be useful X-ray
detectors for applications connected with films or photosensitive detectors. We are
now attempting the preparation of thin films of EuMgF₄ by vacuum evaporation, as such
films can be used in evacuated image tubes. We are also attempting to grow single
crystals from off-stoichiometric melts. The broad, intense luminescence in this
compound suggests the possibility that it may be a material for a solid-state laser
which could be tunable over a portion of its emission spectrum, and if it proves to be
ferroelectric, the possibility of modulating its luminescence by electric fields may
be worth exploring.

C.  Synthesis  of  Potential  Solid  Electrolytes 

Solid electrolytes have become important in the development of rechargeable
batteries of high energy density for applications to electric vehicles and utility
load-leveling. The most advanced systems of this sort use a sodium-ion conductor based
on NaAl₁₁O₁₇ (β-alumina), which has two-dimensional ion conduction pathways and is
inherently a poorer electrolyte than a material which might have three-dimensional
connections among the cation sites, which would be partially occupied to permit free
movement of ions from one site to the others.

We have attempted to introduce alkali metal ions into the large interstices in
cubic ZrP₂O₇ in the hope that these ions would be mobile. The large interstices can
accommodate ions of radius up to about 1.2 Å, and they are connected
three-dimensionally. The first preparations were made using Y³⁺ as the
charge-compensating species. This resulted in appreciable solid solution formation
only in the case of lithium. Preparations of composition Zr₁₋ₓYₓLiₓP₂O₇ showed a
cubic phase for values of x up to 0.1. Higher doping levels showed no further
increase in lattice parameter and the appearance of a second phase pattern in X-ray
diffractograms.

No solid solution was observed for combinations of yttrium-sodium and
gallium-lithium. A sample of Zr₀.₉Y₀.₁Li₀.₁P₂O₇ has been submitted for ⁷Li NMR to
determine whether the lithium ions are mobile at temperatures up to 200°C. If mobility
is detected, we plan to evaluate the conductivity on the basis of dielectric loss
measurements.

On the basis of ionic radius considerations, it was thought that these
preparations might have lithium on normal Zr sites and yttrium in the large
interstices. The X-ray intensities, which are about the same for pure ZrP₂O₇ and the
solid solution, discourage such an interpretation. With Y in the Zr sites, the 10%
filling of the interstices by lithium would lead to a maximum of 2% change in
intensity for the most sensitive reflections, whereas the distribution suggested
above would give easily observable intensity changes.

Trivalent indium has a radius much closer to that of Zr⁴⁺ than does Y³⁺. We are
planning similar preparations with this ion. Some preparations with Eu³⁺-Li⁺
substitution are also planned. The photoluminescence of Eu³⁺ can be used as a probe
of the Eu environment, as shown by Blasse, et al.

D. Rare Earth-Defect Complexes and Infrared-Visible Upconversion in CdF₂

Our paper on the infrared-visible light conversion in CdF₂:Yb³⁺ has been
published in the Journal of the Electrochemical Society. Measurements of excitation
spectra were unsuccessful because of the presence of second order radiation from the
source, which directly excited the Er³⁺ radiation, and the low intensity of infrared
radiation in the spectrometer source. We have recently obtained evidence from ¹⁹F
NMR studies on a crystal of CdF₂:5% YbF₃ that approximately 50% of the Er³⁺ and
Yb³⁺ ions are involved in mixed (Er³⁺, Yb³⁺, 2Fᵢ⁻) dimers. This is in accordance
with our proposal to account for the unusually high upconversion efficiency in the
phosphor crystals. In that case, the concentration of Er³⁺ ions is only 10% that of
Yb³⁺, so almost all of the Er³⁺ ions might be in such clusters.

LARGE SIGNAL TRANSIENT RESPONSE OF MOSFETS
R. T. Kinasewitz and B. Senitzky

A. Introduction

Most studies of the current-voltage characteristics of the
metal-oxide-semiconductor field-effect transistor (MOSFET) have been directed towards
its D.C. and A.C. (small signal) behavior.

Our  investigation  is  directed  towards  identifying  some  of  the  physical  phenomena 
that  are  responsible  for  the  large-signal  turn-on  transient  response  of  the  MOSFET. 

B.  Research 

In  the  last  report  we  described  our  experimental  observation  of  an  unexplained 
overshoot  in  the  transient  response  of  a p-channel  enhancement  MOSFET.  Since  then 
we  have  experimentally  probed  for  the  cause(s)  of  the  overshoot. 

One  series  of  experiments  consisted  of  exciting  the  MOSFET  with  a D.  C.  biased 
gate-to-source  signal  to  determine  the  effect  on  the  output  transient  of  an  input  D.  C. 
bias  voltage.  The  bias  circuit  shown  in  Fig.  1 was  installed  in  a General  Radio  Type 
874-x  insertion  unit. 


[Fig. 1 circuit diagram; labels: signal from pulse generator, to offset voltage
power supply, two 22 kΩ resistors, signal to MOSFET circuit.]

Fig. 1. Offset voltage circuit.

Typical results are shown in Figures 2(a) and 2(b).

[Fig. 2 oscillograms (a) and (b); annotations: input, vertical scale, D.C.
gate-to-source bias voltage.]

Fig. 2. MOSFET output vs. D.C. gate-to-source voltage.


In Figs. 2(a) and 2(b) it is seen that a gate-to-source bias voltage more negative
than -6V practically eliminates the positive-going overshoot, thus showing that this
overshoot is an intrinsic device characteristic dependent on the input voltage and not
merely an extrinsic capacitance-inductance effect. Extrinsic capacitances and
inductances are essentially linear parameters whose transient response is unaffected
by D.C. inputs; in this experiment, however, the D.C. level of the input signal
strongly affected the waveshape of the transient response.
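The linearity argument above can be illustrated numerically. The sketch below is not a model of the measured device or test fixture: it assumes a purely hypothetical series RLC parasitic network (the values of `r`, `l` and `c` are invented for illustration) and shows that, for such a linear circuit, the fractional overshoot of the step response does not depend on the D.C. level on which the step rides.

```python
def rlc_step_response(v_offset, v_step, r=50.0, l=10e-9, c=2e-12,
                      dt=1e-12, steps=4000):
    """Capacitor voltage of a series RLC circuit driven by v_offset + v_step*u(t).

    The circuit starts in D.C. steady state (capacitor charged to v_offset,
    no current). Component values are illustrative assumptions only.
    """
    v_cap, current = v_offset, 0.0
    v_in = v_offset + v_step
    trace = []
    for _ in range(steps):
        # Semi-implicit Euler step: update the current, then the capacitor voltage.
        current += dt * (v_in - r * current - v_cap) / l
        v_cap += dt * current / c
        trace.append(v_cap)
    return trace


def fractional_overshoot(v_offset, v_step):
    """Peak excursion beyond the final value, as a fraction of the step size."""
    trace = rlc_step_response(v_offset, v_step)
    final = v_offset + v_step
    peak = max(trace) if v_step > 0 else min(trace)
    return (peak - final) / v_step


overshoot_no_bias = fractional_overshoot(0.0, 1.0)
overshoot_biased = fractional_overshoot(-6.0, 1.0)
print(overshoot_no_bias, overshoot_biased)
```

Both calls return the same overshoot to within rounding error, which is the behavior expected of linear parasitics and the opposite of what Fig. 2 shows for the device.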


Another series of experiments consisted of biasing the MOSFET with a D.C.
substrate-to-source voltage (V_SS) to determine its effect on the output transient.
The MOSFET circuit was modified to contain the substrate-to-source biasing voltage
and is shown in Figure 3.

[Fig. 3 circuit diagram; labels: drain, substrate, 2200 Ω, output, source, input.]

Fig. 3. MOSFET circuit with D.C. substrate-to-source bias.

Typical results are shown in Figure 4.

[Fig. 4 oscillogram; annotations: V_DD = 40 V, V_input = -25 V, vertical scale =
40 mA/div, horizontal scale = 1 nsec/div; traces labelled 0 V, 2 V; 10 V; 30 V.]

Fig. 4. MOSFET output vs. D.C. substrate-to-source bias.

Figure 4 shows that a substrate-to-source D.C. voltage that back biases the
source-substrate P-N junction will cause a decaying oscillation of the output
transient.

C. Future Work

Our studies of the large signal subnanosecond transient behavior of MOSFET
devices are continuing, and we hope to be able to elucidate some of the
observed behavior in a future report.
WAVE-MATTER  INTERACTIONS 




TURBULENCE IN PLASMA-LIKE SYSTEMS
S.  Barone  and  N.  Marcuvitz 

An unobjectionable self-consistent description of turbulent electromagnetic
fields in plasma-like systems, wherein charges and currents are nonlinearly field
dependent, is not known. Renormalization procedures applicable in a linear and
quasi-linear range fail with the onset of marked nonlinearities. An initial approach
to these difficulties is being explored via renormalization of higher-point
correlations of the electromagnetic fields. For simplicity of illustration we
consider an electron plasma.

A.  Two-Point  Distribution  Functions  for  the  Electromagnetic  Field 

Consider,  for  simplicity,  an  electron  plasma  (charge  -e,  mass  m)  with  a fixed 
neutralizing  background.  Maxwell's  equation  may  be  written 

. E = e J (dv)  V f (v ) , 

where  the  dyadic  operator  Yj^  is  the  same  as  in  I and  f is  the  stochastic  electron 
distribution  function.  According  to  Eq.  (37a)  of  Ref . (1) 

f = f ® + G T^  f^ 

where  we  have  defined  f^  s G T].  If  T^  is  approximated  by 


Maxwell's  equations  take  the  form 


where 

2 

’^e  ■ ^ j (dv)  V G(v,  v')(dv')  7^/  f^/)  , 

J s (-e)  j (dv^)  V f ^ (v)  + • E 

These  equations  have  the  same  structure  as  the  Klimontovich  equation  so  that  the 
techniques  of  Ref.  (1)  apply  directly. 

Following  Ref.  (1)  we  introduce  a stochastic  field  Green's  function  Z by 

(Y^  - V ) . i = 1 , 
where  Y s y.,  - V . The  ensemble  average  field  Green's  function  satisfies 
o M e ® 

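$$(Y_0 - V_{ce}) \cdot Z = 1,$$

which might be taken as a defining equation for $V_{ce}$. A $T_{ce}$ operator is
defined by

$$\hat{Z} = Z + Z \cdot T_{ce} \cdot Z,$$

with $\hat{Z}$ the stochastic and $Z$ the ensemble averaged field Green's function.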

The stochastic, two point distribution function for the field satisfies

$$(Y_0 - V_e)_1 \, (Y_0 - V_e)_2 : E(1)E(2) = J(1)J(2).$$

Defining

$$Y_{12} = Y_0(1)\,Y_0(2) + \langle V_e(1)\,V_e(2) \rangle,$$

$$V_e^{(2)} = Y_0(1)\,V_e(2) + V_e(1)\,Y_0(2) + \langle V_e(1)\,V_e(2) \rangle - V_e(1)\,V_e(2),$$

this may be written

$$(Y_{12} - V_e^{(2)}) : E(1)E(2) = J(1)J(2).$$


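A two-particle stochastic Green's function may be introduced by

$$(Y_{12} - V_e^{(2)}) : \hat{Z}_{12} = 1.$$

The average satisfies

$$(Y_{12} - V_{ce}^{(2)}) : Z_{12} = 1,$$

which defines $V_{ce}^{(2)}$. Also a two-particle version of $T_{ce}$ may be defined by

$$\hat{Z}_{12} = Z_{12} + Z_{12} : T_{ce}^{(2)} : Z_{12}.$$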

is  related  to  T^^^  by 
ce  ce  ’ 


v(2)  _ ^ ^2)  ; z . x(2) 

ce  e 12  ce 

The solution for $E(1)E(2)$ is

$$E(1)E(2) = \hat{Z}_{12} : J(1)J(2).$$

Taking the ensemble average we have

$$\langle E(1)E(2) \rangle = Z_{12} : \langle J(1)J(2) \rangle + \langle Z_{12} : T_{ce}^{(2)} : Z_{12} : J(1)J(2) \rangle.$$

The differential equation characterization of $\langle E(1)E(2) \rangle$ is

$$(Y_{12} - V_{ce}^{(2)}) : \langle E(1)E(2) \rangle = \langle J(1)J(2) \rangle + \langle T_{ce}^{(2)} : Z_{12} : J(1)J(2) \rangle.$$

In terms of the fully interacting single particle Green's function $Z$, this
equation reads

$$(Z_1^{-1} Z_2^{-1} - I_e^{(2)}) : \langle E(1)E(2) \rangle = B_{12} : \langle J(1)J(2) \rangle,$$

where

$$I_e^{(2)} \equiv V_{ce}^{(2)} - V_{ce}(1)\,Z_2^{-1} - Z_1^{-1}\,V_{ce}(2) - V_{ce}(1)\,V_{ce}(2) - \langle V_e(1)\,V_e(2) \rangle,$$

and a new operator $B_{12}$ has been introduced by

$$B_{12} : \langle J(1)J(2) \rangle = \langle (1 + T_{ce}^{(2)} : Z_{12}) : J(1)J(2) \rangle.$$

The proper determination of the operator $B_{12}$ involves a renormalization of the
strength of the coupling between $\langle E(1)E(2) \rangle$ and $\langle J(1)J(2) \rangle$.

B. Kinetic Equations for Interacting Quasi-Particles

When the effect of the interaction operator $I^{(2)}$ is small it is useful to
define renormalized collective modes or wave types by ($Y \equiv Z^{-1}$)

$$Y \cdot E_\alpha = 0, \qquad \alpha = 1, 2, \ldots$$

or, for a weakly inhomogeneous system,

$$\mathrm{Det}\, Y(k, \omega) = 0,$$

where $Y(k, \omega)$ is the Fourier-Laplace amplitude of the fast $(r, t)$ dependence
of $Y$ and the slow space-time dependence of the transform has been suppressed. The
above equation defines complex mode frequencies

^ ' a = 1,  2 . . . 

Because of the implicit, slow space-time dependence of $Y(k, \omega)$, the mode
dispersion relations $\omega_\alpha(k)$ and growth/decay rates $\gamma_\alpha(k)$ are
implicitly slowly varying with position and time. These modes fall into two classes
generally referred to as proper and improper or quasi-modes.

Suppose, for simplicity, that there is no background field, i.e., E = 0. Then in
transform space
Y*<2)  • < E(l)  E*(2)> 

= Z(l)  • [Bi2  < J*(l)  J**(2)  > + < id)  i*(2)  > ] (1) 

where  I,  2..  . etc.  now  denote  the  transform  variables  (k^,  w^,  (k2»  ^2} 

2 has  reduced  to 

J**(-e)  J (d^v£*(v) 

The  above  result  is  to  be  compared  with  Eq.  (55)  of  Ref.  (I).  Following  the  procedure 
outlined  in  Ref.  (1)  we  now  have 


Y*(k+  iV,  w-ia^)  • t)  = Z(k,  w)  • t)  , 


where 


d«'  - 

/ 


- s a r ' dw"  Z 

* J 771:1  5kw  5k  * 


i(k*.  r -w*t) 


k'  ■ k - k',  w'  Bw-u',  the  slow  space -time  dependence  has  now  been  made  explicit, 

and  (r , t)  is  related  to  the  bracketed  term  in  £>}.  (i)  in  the  same  way  that  t) 

is  related  to  < Ej^  E*^>  i.  e. 

* 

V'--**  ■ L VkV* 


Integration  over  the  w-plane  contour  discussed  in  Ref.  (1)  yields 
Y*(k+iV.  ^ t)  = (i — ' 


aw. 


- a 


where  , r,  t)  is  the  kinetic  phase  space  distribution  function  for  a type  quasi- 
particles of  momentum  k and  position  r.  When  the  mode  frequencies,  w^  are  suffi- 
ciently close  to  the  real  axis  we  have,  again  following  Ref.  (1) 


( a.  + «-  • V - • V,, ) = 2 Y_  3 

* “ aka  a a [8B(k,  f 


where,  for  real  (k,  w) 


Y(k,«)  B G(k,  w)  - iB(k.«)  , 


.:ik 
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dB(ka)) 


a 


and,  for  simplicity,  we  have  assumed  an  isotropic  situation.  The  above  equations  are 
a set  of  coupled  kinetic  equations  for  the  quasiparticle  distribution  functions 


C. Klimontovich Equation for Quasi-Particles


When quasi-particles are treated at a level of approximation that includes
interactions, a Klimontovich description is desirable. If a Klimontovich equation can
be derived for a stochastic phase space distribution function for quasi-particles,
all of the techniques developed for dealing with real particles can be applied to
quasi-particles. In particular, quasi-particles will be scattered by real particles
as well as by other quasi-particles; localized bound states will occur, as will
collective oscillations involving quasi-particles. The latter will have a spectrum of
discrete modes, some of which define new quasi-particle types. Presumably the degree
of turbulent excitation of a system determines the degree of excitation of the
various levels of quasi-particles.

The derivation of a Klimontovich equation for quasi-particles follows from

$$(Z_1^{-1} Z_2^{-1} - I^{(2)}) : E(1)E(2) = (1 + T^{(2)} : Z_{12}) : J(1)J(2).$$


We  now  find 


» <^kco 

= 2Y  S ^ j 

^ [aB(k, 

where  t)  is  the  stochastic  phase  space  distribution  function  for  quasi^articles 

of  a type.  iSj^is  derived  from 

r du>"  - ^ ) 

■■  / 3 TW  ?kw5k'w'® 

« (2ti) 

in  the  same  way  that  is  derived  from  Notice  that  (S^^is  the  ensemble  average 

of  3^.  The  left  hand  side  of  the  above  equation  describes  the  streaming  of  a type 
quasi-particles  under  the  averaged  influence  of  the  rest  of  the  system  i.e,  under  the 
influence  of  the  ensemble  averaged  "forces".  The  term  describes  the  production 
orabsorption  of  a type  quasi  particles.  The ^ team  is  defined  by 
(dk") 


dw"  r 

2TT  L 


i(k".  r -w"t) 


kw,  k 'u* 


where  (assuming  E = 0) 

[ (1  + T^^^^  Z^2)  J^(l)  J^*(2)  + E(l)  i*(2)  . 

Clearly  the  J J term  describes  the  stochastic  excitation  of  the  system  by  the  external 
particle  sources  T)  . The  EE  term  can  be  expressed  in  terms  of  the  Fourier -Laplace 
transform  of  (?j^^(r,  t)  which  in  turn^s  a linear  combination  of  all  the  <^jj(kf  r,  t).  Thus 
the  above  equations  are  a coupled  set  for  all  of  the  stochastic  distribution  functions 

S (cl  = 1,2...).  These  equations  have  a linear  appearance.  However,  there  are 

^ (2)  (2) 

implicit  nonlinearities  in  T , ^^2’ 
HIGH POWER MICROWAVE PROPAGATION THROUGH THE ATMOSPHERE
S.  Barone,  N.  Marcuvitz,  R.  Pascone  and  N.  Solimene 

This  is  a brief  report  on  the  continuing  feasibility  study  to  ascertain  conditions 
under  which  a high  power  microwave  beam  propagating  through  and  ionizing  the  atmos- 
phere can  be  sufficiently  channeled  to  prevent  spreading  of  the  beam  energy.  Our  ten- 
tative summary  view,  based  on  extrapolation  of  available  breakdown  data,  is  that  it  is 
possible  to  propagate  high  power  nanosecond  microwave  pulses  through  the  atmosphere 
provided  that  the  intensity  and  pulse  length  are  properly  selected.  Further  study  is 
under  way  to  determine  whether  the  self-induced  plasma  entrained  in  such  pulses  can 
be  structured  to  provide  focussing  sufficient  to  offset  diffractive  spreading  of  the  pulses. 

A.  CW  Propagation 

The possibility of self-focussing and self-trapping of radiation fields with beam
diameters comparable to or larger than a free space wavelength in fully ionized,
asymptotically uniform plasma is well known. The possibility of a radiation field
ionizing an initially neutral atmosphere in such a way as to self-focus or channel is
limited by energy considerations, relaxation/growth rates, medium hydrodynamics, etc.
Several important conclusions can be drawn on the basis of energy considerations
alone.

In particular, approximately 60 joules are required to singly ionize every molecule
in one cubic centimeter of air at sea level. This is of the same order of magnitude as
the energy per pulse available from state of the art high power microwave sources.
Thus for kilometer path lengths ($10^5$ cm), say, and beam diameters of the order of a
microwave wavelength (~1 cm), the maximum achievable ionization is about one part in
$10^5$. The density of neutral molecules (~$10^{19}$/cm$^3$) is essentially unchanged.
For this density of neutrals the electron-neutral collision frequency ($\nu$) is about
$10^{12}$/sec, which exceeds microwave frequencies ($\omega$) by an order of magnitude
or more. In this situation the effective linear, non-relativistic dielectric constant
is

$$\epsilon \approx 1 - \left(\frac{\omega_p}{\nu}\right)^2 + i\,\frac{\nu}{\omega}\left(\frac{\omega_p}{\nu}\right)^2,$$

where $\omega_p = 10^4 \sqrt{N}$ is the plasma frequency in Hz when $N$ is the electron
density in particles/cm$^3$. For the maximum possible fractional ionization of one
part in $10^5$, or $10^{14}$ electrons/cm$^3$, the reduction of the real part of the
dielectric constant due to ionization is about 1%. However, for this electron density,
the imaginary part of $\epsilon$ is about 0.1, so that the attenuation length due to
atmospheric heating is only about 10 wavelengths. In order for the attenuation
distance to be comparable to a path length of $10^5$ wavelengths, the electron density
can be no greater than $10^{10}$/cm$^3$. For this situation the real part of the
dielectric constant is decreased by $10^{-6}$. This is insufficient to cause
appreciable self-focussing on a cw basis.
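The energy argument above can be checked with a few lines of arithmetic. The sketch below is an order-of-magnitude illustration, not the authors' calculation: the sea-level molecular density (about 2.7 x 10^19 per cm^3) and a representative ionization energy of 15 eV per molecule are assumed values not quoted in the text, and the beam volume is taken as path length times diameter squared.

```python
EV_TO_JOULE = 1.602e-19
SEA_LEVEL_DENSITY = 2.7e19     # molecules/cm^3; assumed standard value
IONIZATION_ENERGY_EV = 15.0    # assumed representative value for air molecules


def energy_to_ionize(volume_cm3, density_per_cm3=SEA_LEVEL_DENSITY):
    """Energy in joules needed to singly ionize every molecule in the volume."""
    return volume_cm3 * density_per_cm3 * IONIZATION_ENERGY_EV * EV_TO_JOULE


def max_fractional_ionization(pulse_energy_j, path_cm, diameter_cm,
                              density_per_cm3=SEA_LEVEL_DENSITY):
    """Upper bound on fractional ionization if all pulse energy went into ionization."""
    volume_cm3 = path_cm * diameter_cm ** 2   # order-of-magnitude beam volume
    return min(1.0, pulse_energy_j / energy_to_ionize(volume_cm3, density_per_cm3))


energy_per_cm3 = energy_to_ionize(1.0)
fraction_sea_level = max_fractional_ionization(60.0, 1.0e5, 1.0)
print(f"{energy_per_cm3:.0f} J per cm^3, fraction {fraction_sea_level:.1e}")
```

With these assumptions the energy per cubic centimeter comes out near 65 joules and the fractional ionization near one part in 10^5, in line with the figures quoted in the text.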
At higher altitudes, say 70,000 feet, atmospheric density is smaller by about a
factor of 100, and 60 joules will singly ionize 100 cubic centimeters of air. For
kilometer path lengths and centimeter diameter beams, the maximum achievable
fractional ionization is $10^{-3}$. The electron-neutral collision frequency is
$10^{10}$/sec, which is less than a typical operating frequency, $\omega = 10^{11}$/sec,
by an order of magnitude. Thus

$$\epsilon = 1 - \left(\frac{\omega_p}{\omega}\right)^2 + i\,\frac{\nu}{\omega}\left(\frac{\omega_p}{\omega}\right)^2,$$

and for an attenuation distance of $10^5$ wavelengths, the electron density again
cannot exceed $10^{10}$/cm$^3$. For this situation the real part of $\epsilon$ is
decreased by $10^{-4}$. Again this is insufficient to cause appreciable
self-focussing on a cw basis.
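The two limiting expressions for the dielectric constant (collision-dominated at sea level, nearly collisionless at altitude) are both limits of the standard collisional cold-plasma form, $\epsilon = 1 - \omega_p^2 / [\omega(\omega + i\nu)]$. The sketch below evaluates that full form for the parameter values quoted in the text. It is an illustration under stated assumptions: the function names are hypothetical, and the text's rule of thumb $\omega_p = 10^4\sqrt{N}$ is used as given, without distinguishing Hz from rad/sec, since only orders of magnitude matter here.

```python
def plasma_frequency(n_e_cm3):
    """Rule of thumb from the text: about 1e4 * sqrt(N), with N in electrons/cm^3."""
    return 1.0e4 * n_e_cm3 ** 0.5


def drude_epsilon(n_e_cm3, omega, nu):
    """Collisional cold-plasma dielectric constant, exp(-i*omega*t) convention."""
    wp = plasma_frequency(n_e_cm3)
    return 1.0 - wp ** 2 / (omega * (omega + 1j * nu))


OMEGA = 1.0e11   # operating frequency used in the text

# Sea level: nu = 1e12/sec, maximum electron density 1e14/cm^3.
eps_sea_level = drude_epsilon(1.0e14, OMEGA, 1.0e12)

# 70,000 feet: nu = 1e10/sec, electron density limited to 1e10/cm^3.
eps_altitude = drude_epsilon(1.0e10, OMEGA, 1.0e10)

for label, eps in (("sea level", eps_sea_level), ("altitude", eps_altitude)):
    print(f"{label}: 1 - Re(eps) = {1.0 - eps.real:.2e}, Im(eps) = {eps.imag:.2e}")
```

The sea-level case gives a 1% reduction of the real part and an imaginary part of about 0.1; the high-altitude case gives a reduction of about $10^{-4}$ with an imaginary part of about $10^{-5}$.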

These  considerations  indicate  that  until  very  much  higher  power  levels  become 
available,  cw  channeling  is  limited  to:  (1)  small  beam  diameters  (<10  ^ cm)  within 
which  full  ionization  can  be  achieved  or,  (2)  short  pulses,  so  that  the  above  cw  consid- 
erations do  not  apply.  Self  trapped  beams  with  diameters  much  less  than  a free  space 
microwave  wavelength  are  possible  and  we  have  studied  their  properties.  The  excita- 
tion of  such  channels  poses  a difficult  technological  problem. 


B.  Propagation  of  Short  Pulses 


In the atmosphere, electromagnetic waves of power density of the order of a
megawatt/cm$^2$, depending on altitude, produce ionization phenomena that radically
modify wave propagation. Ionization gives rise to a space-time dependent plasma that,
for finite length pulses, attenuates the tail of the pulse but does not affect the
leading edge because of the finite time for the plasma to build up. Accordingly, for
sufficiently short pulses, it is possible to propagate a high power pulse with
relatively small attenuation. Since diffractive spreading of such pulses on
propagation beyond the Fraunhofer distance results in a decrease in wave intensity,
it is of interest to explore whether nonlinear effects in the self-induced plasma can
result in sufficient focussing to compensate for wave spreading.


Our preliminary analysis of pulse propagation has been concerned with the
determination of optimum pulse size as a function of power density, altitude
(atmospheric pressure), etc. We consider at first only ionization, attachment,
recombination, and wave attenuation effects for the case of a one-dimensional plane
pulse, i.e., we defer consideration of pulse spreading and focussing effects. The
model used for our preliminary investigation of the propagation of microwave power
through the atmosphere is given in terms of the power density and the electron number
density; the transverse spatial dependence has been neglected in this initial work.
The normalized equations are

$$\frac{\partial P}{\partial t} + \frac{\partial P}{\partial z} + \beta P = 0,$$

$$\frac{\partial N}{\partial t} = \alpha N - \gamma N^2,$$

where $P$ is the power density relative to the cw breakdown power density and $N$ is
the electron density. The unit of distance is such that the velocity of light is
unity. The coefficient $\alpha$ is the net ionization frequency:

$$\alpha = \nu_i - \nu_a,$$

where $\nu_i$ is the ionization frequency and $\nu_a$ is the attachment frequency. The
coefficient $\gamma$ is the recombination coefficient. These are empirical functions
of $E$ and $p$, where $E$ is the effective field strength and $p$ is the pressure. The
attenuation coefficient $\beta$ is given in terms of the momentum transfer collision
frequency, $\nu$, and the electron density, i.e.,

$$\beta = \frac{\nu\,\omega^2}{\omega^2 + \nu^2}\,\frac{N}{N_0} \approx \frac{\omega^2}{\nu}\,\frac{N}{N_0},$$

where $N_0$ is the electron density corresponding to a plasma frequency equal to
$\omega$, i.e.,

$$N_0 = \frac{m\,\omega^2\,\epsilon_0}{e^2}.$$

In general, the momentum transfer collision frequency is also a function of $E$ and
$p$. However, for simplicity our preliminary work has taken $\nu/p$ to be constant and
equal to $5 \times 10^{9}$ (sec-Torr)$^{-1}$.
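To make the structure of the model concrete, the sketch below integrates the pair of equations $\partial P/\partial t + \partial P/\partial z + \beta P = 0$ and $\partial N/\partial t = \alpha N - \gamma N^2$ in the retarded time $\tau = t - z$ (with $c = 1$), where they reduce to $\partial P/\partial z = -\beta P$ at fixed $\tau$ and $\partial N/\partial \tau = \alpha N - \gamma N^2$ at fixed $z$. It is a toy illustration, not the program used for the computer plots: the dimensionless constants, the linear dependence of $\alpha$ on $P$ (positive above the cw breakdown level $P = 1$), and the form $\beta = \kappa N$ are all assumptions chosen only to show tail erosion.

```python
import math


def propagate_pulse(p0=4.0, n_tau=200, d_tau=0.05, n_z=100, d_z=0.05,
                    n_init=1e-6, alpha0=2.0, gamma=0.1, kappa=5.0):
    """Propagate an initially square pulse; returns P(tau) after n_z steps in z.

    All parameters are dimensionless illustrative values (assumptions).
    Index 0 of the returned list is the leading edge of the pulse.
    """
    power = [p0] * n_tau
    for _ in range(n_z):
        # Electron density builds up through the pulse at the current z.
        density = []
        n_e = n_init
        for p in power:
            density.append(n_e)
            alpha = alpha0 * (p - 1.0)          # assumed net ionization rate
            n_e += d_tau * (alpha * n_e - gamma * n_e * n_e)
            n_e = max(n_e, 0.0)
        # Each time slice of the pulse is attenuated over one step in z.
        power = [p * math.exp(-kappa * n * d_z) for p, n in zip(power, density)]
    return power


final_power = propagate_pulse()
print(f"leading edge: {final_power[0]:.4f}, tail: {final_power[-1]:.2e}")
```

In this toy run the leading edge is essentially unattenuated while the tail is removed, which is the qualitative behavior described in the text for finite length pulses.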

The data on ionization and attachment frequencies are often given in terms of the
coefficients $\alpha_i$ and $\alpha_a$ respectively, where

$$\nu_i = v_d\,\alpha_i, \qquad \nu_a = v_d\,\alpha_a,$$

and $v_d$ is the electron drift velocity. The ionization coefficient is well
represented by

$$\frac{\alpha_i}{p} = A\,e^{-Bp/E},$$

where for air $A = 15$ (cm-Torr)$^{-1}$ and $B = 365$ V/(cm-Torr) for $E/p$ in the range
100-800 V/(cm-Torr) (see Reference 1). The data of Harrison and Geballe (Reference 2)
for $E/p$ in the range 25-65 V/(cm-Torr) may be fitted with $A = 3.12$ (cm-Torr)$^{-1}$
and $B = 201$ V/(cm-Torr). The latter has been used in the calculations summarized in
the accompanying computer plots. Other representations for $\nu_i$ have also been
used. For example, Mayhan, et al. (Reference 3), use

$$\frac{\nu_i}{p} = f(T, p)\left(\frac{E}{p}\right)^{5.34},$$

where $f(T, p)$ is a given function of temperature and pressure. At room temperature
$f = 8.35 \times 10^{-4}$.

The attachment coefficient data are somewhat less certain. The data of Harrison
and Geballe (Reference 2) for $E/p$ in the range 25-60 V/(cm-Torr) and that of
Chatterton and Craggs (Reference 4) in the range 2-30 V/(cm-Torr) are in marked
disagreement in the overlapping region $25 < E/p < 30$. For the purposes of the
present calculation a rough compromise was struck by using

\ ^k/E,2 

-—  = a + b(--) 

P P 

a ^ I / y 

with  a = 5 . 68  X 10  (sec-Torr)  and  b = 3.05x10  (cm-Torr/V)  /(sec-Torr). 

The drift velocity data used are those of Emeleus, Lunt and Meek as given on page
719 of Loeb's book (Reference 5) for $E/p$ in the range 20-595 V/(cm-Torr). These data
are well summarized by

V,  = 0.125  x 10^  ( — cm/sec 

when $E/p$ is given in V/(cm-Torr). Since $v_d = eE/m\nu$, we may infer the momentum
transfer collision frequency

- = 1.41xl0^(  — (sec-Torr)"^ 

P P 

A cruder fit to the drift velocity data is

$$v_d = a_0 + a_1\left(\frac{E}{p}\right),$$

where $a_0 = 1.44 \times 10^{7}$ cm/sec and $a_1 = 1.15 \times 10^{5}$ cm$^2$-Torr/(sec-V).
The latter was the form actually used in the calculations performed to date.
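The fits quoted above combine into a simple expression for the ionization frequency, $\nu_i = v_d\,p\,A\,e^{-Bp/E}$, which is the form that appears in the legend of the computer plots. The sketch below evaluates it using the Harrison and Geballe fit constants and the linear drift velocity fit. It is an illustration only: the function names are hypothetical, the powers of ten in $a_0$ and $a_1$ are taken as $10^7$ and $10^5$, and the fit is only meaningful within the quoted range of $E/p$ (25-65 V/(cm-Torr)).

```python
import math

A_FIT = 3.12        # (cm-Torr)^-1, Harrison and Geballe fit
B_FIT = 201.0       # V/(cm-Torr), Harrison and Geballe fit
A0_DRIFT = 1.44e7   # cm/sec, linear drift velocity fit
A1_DRIFT = 1.15e5   # cm^2-Torr/(sec-V), linear drift velocity fit


def drift_velocity(e_over_p):
    """Electron drift velocity in cm/sec for E/p in V/(cm-Torr)."""
    return A0_DRIFT + A1_DRIFT * e_over_p


def ionization_frequency(e_over_p, pressure_torr):
    """Ionization frequency nu_i = v_d * p * A * exp(-B / (E/p)), in 1/sec."""
    alpha_i_over_p = A_FIT * math.exp(-B_FIT / e_over_p)
    return drift_velocity(e_over_p) * pressure_torr * alpha_i_over_p


nu_i_example = ionization_frequency(40.0, 1.0)
print(f"nu_i at E/p = 40 V/(cm-Torr), p = 1 Torr: {nu_i_example:.3e} per sec")
```

Because of the exponential factor, the ionization frequency rises very steeply with $E/p$, so the result is sensitive to which set of fit constants is adopted.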

The recombination coefficient was taken from the data of Sayers as given in S.C.
Brown's book (Reference 6). The value of $\gamma$ is approximately $2.3 \times 10^{-6}$
cm$^3$/sec at one atmosphere and decreases to zero as the pressure is lowered. We
therefore approximated

$$\frac{\gamma}{p} = \frac{2.3 \times 10^{-6}}{760}\ \mathrm{cm^3/(sec\text{-}Torr)}.$$

A summary view of the current state of our analysis is contained in the following
series of computer generated graphs displaying the space-time dependent variation of
power density in a propagating planar pulse for various parameter ranges. The
displays are three-dimensional with power density, POW(z,t), relative to the cw
breakdown power density as the ordinate, distance z along the propagation direction
as the abscissa (Zf = 0.3 meters), and the third dimension is the time t measured in
nanoseconds (Note: one nanosecond corresponds to wave travel of approximately 0.3
meter). The equations as well as the ionization, attachment, etc., parameters used to
describe approximately the propagation of a pulsed wave train are indicated in the
legend at the top of each graph. The captions are limited to upper case Roman
characters and the names of the various parameters are limited to four characters.
For these reasons, the following dictionary to the computer graphs is provided:

First  line; 

D/DT  - 

D/DT  - 

BETA  — p 
ALPH  — a 
GAMA  — V 

Note also a misprint: N(X, T) should be N(Z, T).

Second  line; 


BETA=W2/v>i={n/N0) 

means 

3 

P V N 

o 

ALPH=NUI-NUA means $\alpha = \nu_i - \nu_a$

NUI=VD*P*A*EXP(-B/(E/P)) means $\nu_i = v_d\,p\,A\,e^{-Bp/E}$

Third  line; 

NUA=VD*P*(KI + KJ*(E/P)^2) means $\nu_a = v_d\,p\,[a + b(E/p)^2]$

where

KI → a
KJ → b

VD=KU + KV*(E/P) means $v_d = a_o + a_1(E/p)$
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where 

KU → $a_o$

KV → $a_1$

GAMA=KK*P  means  y = 

where 

KK  — Y 

Fourth line:

POW → $P_o$
OMGA → $\omega$
PRES → p

Fifth line:

N(0) → initial electron density
NO → $N_o$
NMAX → maximum electron density
In  the  remaining  lines: 

W2/V → $\omega^2/\nu$
EB/P → $E_b/p$ where $E_b$ is effective cw breakdown field strength
ENER → integral of final power density
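The legend formulas in this dictionary (VD, NUI, NUA, ALPH) translate directly into code. The sketch below is a minimal illustration, not the program used for the graphs: the function names are hypothetical, every coefficient is passed in by the caller, and the numbers in the usage lines are arbitrary placeholders rather than values from the report.

```python
import math

# Direct transcription of the legend formulas; all coefficients are supplied
# by the caller. Function names are hypothetical.


def drift_velocity(e_over_p, ku, kv):
    """VD = KU + KV*(E/P)."""
    return ku + kv * e_over_p


def ionization_frequency(e_over_p, pressure, a_coef, b_coef, ku, kv):
    """NUI = VD*P*A*EXP(-B/(E/P))."""
    vd = drift_velocity(e_over_p, ku, kv)
    return vd * pressure * a_coef * math.exp(-b_coef / e_over_p)


def attachment_frequency(e_over_p, pressure, ki, kj, ku, kv):
    """NUA = VD*P*(KI + KJ*(E/P)^2)."""
    vd = drift_velocity(e_over_p, ku, kv)
    return vd * pressure * (ki + kj * e_over_p ** 2)


def net_ionization_rate(e_over_p, pressure, a_coef, b_coef, ki, kj, ku, kv):
    """ALPH = NUI - NUA; positive values mean net electron production."""
    nu_i = ionization_frequency(e_over_p, pressure, a_coef, b_coef, ku, kv)
    nu_a = attachment_frequency(e_over_p, pressure, ki, kj, ku, kv)
    return nu_i - nu_a


if __name__ == "__main__":
    # Placeholder coefficients chosen only to exercise the formulas.
    demo = dict(a_coef=2.0, b_coef=1.0, ki=0.5, kj=0.0, ku=1.0, kv=0.0)
    print(net_ionization_rate(1.0, 1.0, **demo))   # ionization exceeds attachment
    print(net_ionization_rate(0.5, 1.0, **demo))   # attachment exceeds ionization
```

With these placeholder coefficients the sign of ALPH changes between the two field values, which is the behavior that separates growth from decay of the electron density.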

The different sequences display how long pulses are attenuated into short pulses via electron-neutral collisional mechanisms for different initial peak powers, pulse lengths, and atmospheric pressures. It should be emphasized that these are preliminary and approximate results; further analysis is continuing.
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Pulse  propagation  through  atmosphere.  Series  showing  dependence  on  pulse  width 
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A PHENOMENOLOGICAL  DESCRIPTION  OF  INTENSITY  DEPENDENT  REFLECTION 
N.  Solimene  and  M.C.  Newstein 

The Drude theory of the optical properties of metals is based on a free electron model. Insofar as inter-band transitions, ion-core polarization effects, band structure dependence, electron-electron correlations, non-spherical Fermi surfaces, etc., may be ignored, this theory yields the complex dielectric constant as a function of the frequency $\omega$:

$$\epsilon(\omega) = 1 - \frac{\omega_p^2}{\omega(\omega + i\nu)}$$

where

$$\omega_p^2 = \frac{N e^2}{\epsilon_o m}$$

e = electron charge
m = electron mass
N = electron number density
$\epsilon_o$ = permittivity of free space
$\nu$ = electron collision frequency

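The optical properties are then given by the complex index of refraction n + ik where

$$n = \left[\frac{(\epsilon_r^2 + \epsilon_i^2)^{1/2} + \epsilon_r}{2}\right]^{1/2}, \qquad k = \left[\frac{(\epsilon_r^2 + \epsilon_i^2)^{1/2} - \epsilon_r}{2}\right]^{1/2}$$

with $\epsilon = \epsilon_r + i\epsilon_i$. In particular the Fresnel reflection coefficient is given by

$$R = \frac{(n-1)^2 + k^2}{(n+1)^2 + k^2}$$
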

Thus the metal parameters which determine the optical properties are the effective electron number density, the effective electron mass and the electron collision frequency. For many metals, reasonable choices for these parameters may be made and agreement with experiment obtained for infrared and visible frequencies.

Within the framework of the Drude theory the temperature dependence of the optical properties of pure simple metals is primarily via the temperature dependence of the collision frequency. The collisions in question are those between electrons and phonons. The collision frequency increases approximately linearly with temperature between room temperature and the melting point. Thus the collision frequency is expected to increase by a factor of roughly five. Typical values at room temperature are

$$\frac{\omega}{\omega_p} = 0.1, \qquad \frac{\nu}{\omega_p} = 0.01$$

and R is then higher than 0.9 between room temperature and melting. This is in rough agreement with experiment at low light intensities. However, measurements using laser radiation at intensities of ~$10^8$ W/cm$^2$ indicate that the reflectivity decreases to values as low as 0.5. The temperature dependent Fresnel reflection coefficient cannot account for this and is moreover not applicable since only a thin surface layer is heated.
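The claim that R stays above 0.9 between room temperature and melting can be checked numerically from the Drude dielectric constant and the Fresnel formula. The following is a minimal sketch with hypothetical function names; it assumes normal incidence from vacuum, frequencies normalized to the plasma frequency, and a fivefold increase of the collision frequency at melting.

```python
import cmath


def drude_epsilon(omega, nu, omega_p=1.0):
    """Drude dielectric constant: 1 - omega_p^2 / (omega*(omega + i*nu))."""
    return 1.0 - omega_p ** 2 / (omega * (omega + 1j * nu))


def fresnel_reflectivity(omega, nu, omega_p=1.0):
    """Normal-incidence reflectivity R = ((n-1)^2 + k^2) / ((n+1)^2 + k^2)."""
    index = cmath.sqrt(drude_epsilon(omega, nu, omega_p))  # n + ik, principal root
    n, k = index.real, index.imag
    return ((n - 1.0) ** 2 + k ** 2) / ((n + 1.0) ** 2 + k ** 2)


if __name__ == "__main__":
    r_room = fresnel_reflectivity(0.1, 0.01)   # typical room-temperature values
    r_melt = fresnel_reflectivity(0.1, 0.05)   # collision frequency five times larger
    print(f"R(room) = {r_room:.3f}, R(melting) = {r_melt:.3f}")
```

Both reflectivities come out above 0.9, so a purely thermal increase of the collision frequency cannot by itself produce reflectivities near 0.5.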

An attempt to account phenomenologically for the laser reflection coefficients may be based on an intensity dependent dielectric constant. Within the framework of the Drude theory one may introduce intensity dependence by modifying the plasma frequency so that

$$\omega_p^2 = \frac{\omega_{po}^2}{1 + \alpha I}$$

where I is the intensity. Thus the metal becomes less overdense and the penetration of electromagnetic radiation more effective. Qualitative mechanisms for such an effect might involve a metal to non-metal transition mediated by drastic modifications of the electronic band structure in the presence of intense radiation (e.g., high frequency Stark effects). This however will not greatly increase the absorption of radiation, but merely distribute the reflection throughout the affected layer. A second possibility is to modify the collision frequency so that

$$\nu = \nu_o(1 + \beta I)$$

Mechanisms for such an effect might include excitations of plasmons via multi-photon processes and ion-core excitations with results similar to impurity center scattering. Without speculating at this time on the justification of the assumed intensity dependence, it is interesting to evaluate the consequences of these assumptions.

The model used assumes plane wave electromagnetic radiation incident on a constant density metal extending from z = 0 to z = +∞. In normalized form, the model equations are
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dE  OH 


3 V 


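Inside the metal we may assume

$$E_y = \mathrm{Re}\,\mathcal{E}\,e^{-i\omega t}, \qquad H_x = \mathrm{Re}\,\mathcal{H}\,e^{-i\omega t}, \qquad v_y = \mathrm{Re}\,\mathcal{V}\,e^{-i\omega t}$$

where the envelope functions $\mathcal{E}$, $\mathcal{H}$ and $\mathcal{V}$ are functions of z and very nearly independent of t. We then have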

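$$\frac{d\mathcal{E}}{dz} = -i\omega\,\mathcal{H}$$

$$\frac{d\mathcal{H}}{dz} = -i\omega\,\mathcal{E} - \mathcal{V}$$

$$\mathcal{V} = \frac{-i\,\mathcal{E}}{\omega + i\nu}$$

Therefore eliminating $\mathcal{V}$, the second of the above equations becomes

$$\frac{d\mathcal{H}}{dz} = -i\omega\,\epsilon\,\mathcal{E}$$

where

$$\epsilon = 1 - \frac{1}{\omega(\omega + i\nu)}$$
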


Note that here $\omega$ and $\nu$ are normalized and $\omega_p = 1$. When the admittance

$$Y = -\frac{\mathcal{H}}{\mathcal{E}}$$

is introduced, we have

$$\frac{dY}{dz} = -i\omega\,(Y^2 - \epsilon)$$


We  note  that 
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and  solve  for  Y(z)  in  the  interval  0 < z < oo . The  amplitude  reflection  coefficient  is 
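then given by

$$\rho = \frac{1 - Y(0)}{1 + Y(0)}$$

and the intensity reflection coefficient is

$$R = |\rho|^2$$

The electric field amplitude inside the metal is given by

$$\mathcal{E}(z) = \mathcal{E}(0^+)\exp\left(+i\omega\int_0^z Y(z')\,dz'\right)$$

where

$$\mathcal{E}(0^+) = (1 + \rho)\,\mathcal{E}_{\mathrm{inc}}$$

and $\mathcal{E}_{\mathrm{inc}}$ is the incident field amplitude which is prescribed.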
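The admittance formulation lends itself to a simple numerical treatment: integrate the Riccati equation for Y from deep inside the metal back to the surface, then form the reflection coefficient from Y(0). The sketch below is a hedged illustration rather than the authors' program. It assumes the normalized form dY/dz = -iω(Y² - ε) with ω_p = 1, takes the collision-frequency profile ν(z) as a prescribed function (the self-consistent dependence on the local intensity is not attempted), starts from the bulk value Y = ε^{1/2} at a finite depth, and uses a fourth-order Runge-Kutta step. All function names and the "heated" profile are hypothetical.

```python
import cmath
import math


def drude_epsilon(omega, nu):
    """Normalized Drude dielectric constant (omega_p = 1)."""
    return 1.0 - 1.0 / (omega * (omega + 1j * nu))


def surface_admittance(omega, nu_of_z, depth=20.0, steps=4000):
    """Integrate dY/dz = -i*omega*(Y^2 - eps(z)) from z = depth back to z = 0 (RK4)."""
    h = depth / steps

    def rhs(z, y):
        return -1j * omega * (y * y - drude_epsilon(omega, nu_of_z(z)))

    z = depth
    # Deep inside the metal the admittance is assumed to equal its bulk value.
    y = cmath.sqrt(drude_epsilon(omega, nu_of_z(depth)))
    for _ in range(steps):
        k1 = rhs(z, y)
        k2 = rhs(z - h / 2, y - h / 2 * k1)
        k3 = rhs(z - h / 2, y - h / 2 * k2)
        k4 = rhs(z - h, y - h * k3)
        y -= h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        z -= h
    return y


def reflectivity(y_surface):
    """Intensity reflection coefficient R = |(1 - Y(0)) / (1 + Y(0))|^2."""
    rho = (1.0 - y_surface) / (1.0 + y_surface)
    return abs(rho) ** 2


def uniform_profile(z):
    """Unperturbed metal: constant collision frequency."""
    return 0.01


def heated_profile(z):
    """Hypothetical surface layer with an enhanced collision frequency."""
    return 0.01 * (1.0 + 50.0 * math.exp(-z))


if __name__ == "__main__":
    print(reflectivity(surface_admittance(0.1, uniform_profile)))
    print(reflectivity(surface_admittance(0.1, heated_profile)))
```

Integrating toward the surface is the numerically stable direction for this equation, since perturbations about the bulk admittance decay when z decreases. For a uniform metal the result reduces to the ordinary Fresnel reflectivity.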

Typical results are summarized in Figs. 1 and 2 for $\beta$ = 1,000 and 5,000 respectively. Figure 1(a) shows the incident intensity $|\mathcal{E}_{\mathrm{inc}}|^2$ and the transmitted intensity $|\mathcal{E}(0^+)|^2$ as well as the reflection coefficient as functions of time. Figure 1(b) shows the intensity $|\mathcal{E}(z)|^2$ for the peak value of R. It is to be noted that, for this value of $\beta$, the reflection coefficient is appreciably changed, but the field intensity inside the metal is monotonically decreasing. In Fig. 2 the corresponding quantities are shown for $\beta$ = 5,000. Notice that the reflection coefficient is drastically reduced and the intensity inside the metal shows a standing wave pattern and indeed $|\mathcal{E}(0^+)|^2$ is greater than $|\mathcal{E}_{\mathrm{inc}}|^2$. A layer of the metal has in effect become transparent.

Further work on this model to include thermal effects and a more careful modelling of the metal may be carried out.


National  Science  Foundation 
ENG76-21829 


N.  Solimene  and  M.C.  Newstein 


[Figure 1, panels (a) and (b): ω = 0.1, ν = 0.01, KB = 2, α = 0, β = 1000; vertical axis INTENSITY; horizontal axis DISTANCE in panel (b), for which R = 0.538347.]

Fig. 1. (a) Temporal dependence and (b) interior spatial dependence for moderate coupling case.
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A THEORETICAL  STUDY  OF  INJECTION  TUNING  OF  OPTICAL  PARAMETRIC 
OSCILLATORS 

E.S.  Cassedy  and  M.  Jain 


Injection tuning of pulsed optical parametric oscillators (OPO) was demonstrated experimentally by Bjorkholm and Danielmeyer^1 in 1969. This technique makes it possible to control the frequency of a high-power, pulsed OPO by injection of low-powered radiation from a frequency controlled source (e.g., an LED). Since the frequency of the OPO can thereby be finely controlled (< 1 cm^-1 in the IR range), the technique is especially promising for photochemistry applications where power and frequency control are important.

In order to design pulsed OPOs which are injection tuned for single-mode, adjustable-frequency output, the time-dynamic behavior of the entire coupled, multimode OPO system must be understood. Bjorkholm and Danielmeyer^1 gave a good qualitative description of the time-dynamic behavior of successful injection operation. For design criteria, however, a more quantitative understanding is needed. In the present work we have conducted a computer study of the equations modelling the multimode parametric oscillator. The equations are of the type formulated by Yariv and Louiselle,^2 originally for assumed single-mode operation of the OPO. Here we have generalized these equations to account for coupling into any of the several resonator modes experiencing parametric gain. These generalized equations are as follows:


da 


j. lU  a - a 

dt  p p 2QI  p 


da 


2n-l _ . , 

dt  ~ ^ 2n-1^2n-l 


n=l 
u 


a,,  a,  . + i k e 
2n  2n-l  p 


- iwt 


2n- 1 

2Q^  ^2n-l  ' ‘^'n“2n“p 


+ K a_  a 


(1) 

(2) 


ui 


a, 

2n  2n 


2n  * 

a,  + K a- 
2Q.  2n  n 2n- 


l^P 


(3) 


where the (complex) functions are:

$a_p = a_p(t)$ = pump mode amplitude
$a_{2n-1} = a_{2n-1}(t)$ = signal mode amplitude of the n-th mode
$a_{2n} = a_{2n}(t)$ = idler mode amplitude for the n-th mode

with

$\omega_{2n-1}$ = resonant frequency of the n-th signal mode
$\omega_{2n}$ = frequency of the idler mode, satisfying: $\omega_p = \omega_{2n-1} + \omega_{2n}$
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= cavity  Q factor  of  signal  modes 
Qj^  = cavity  Q factor  of  idler  modes 

X = pump-source  term;  proportional  to  the  incident  pump  field  tangent  to  the 
P 3 

cavity  boundary 

= coupling  factor  of  n*'^  signal  mode 


= 4-  Ju  U_  U),  7 

2 y p 2n  2n-l 


d 

,3/2^  1/2 


. 2,  AkL, 
sin  — ) 


AkL 


) 


d = nonlinear susceptibility factor
$\epsilon$ = dielectric constant (assumed scalar)
V = volume of the nonlinear crystal
$\Delta k$ = phase match error in wave-number for the n-th mode frequency
L = length of the nonlinear crystal
N = total number of modes

In the model it is possible to consider any number of modes from N = 1 (the single-mode assumption of Yariv and Louiselle) up to the value of N corresponding to the full number under the parametric gain curve, $\sin^2(\Delta k L/2)/(\Delta k L/2)^2$, determined by the phase match characteristics of a particular nonlinear crystal (oriented along a particular axis) and the frequency spacing of the cavity mode resonances. The total number of equations required to model N modes is 2N + 1.

In the computer program the coupling constants have been normalized to that ($\kappa_{max}$) of the mode with maximum gain, and the normalized coupling constant of the n-th (non-maximum gain) mode is $\eta_n = \kappa_n/\kappa_{max}$. The mode amplitudes were all normalized to the threshold value:


Pth 


^1^2 

2K 

max 


where 


^1  = q: 


(4) 


"'2  = 0: 


which is the pump mode amplitude ($A_p = |a_p|$) at the threshold of growth for a single-mode case. The time scale was normalized to the decay time of the pump mode:


T = - 


(5) 
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Using  these  normalizations  in  the  single-mode  (N=l)  case,  the  final  steady-state  of  the 

equations (following Yariv and Louiselle) for $da_p/dt = da_1/dt = da_2/dt = 0$ results in the amplitudes:


lapi  =1 


where 


= 


1/2 


1/2 

) 


= \ /\ 

P Pth 


= the  pump  driving  term  normalized  to  the  threshold  driving 


magnitude: 


\ 

Pth 


^1^2^ 

4K 


and 


«1  = Yj/V. 


"2  = 


The Runge-Kutta method of numerical integration of differential equations was used to solve the equations. High accuracy solutions were achieved on a digital computer for normalized time increments of ΔT = 0.1 for all mode-number cases tried, including N = 13 (27 complex equations). On Fig. 1 are shown solutions where the pump driving amplitude is twice the threshold value (i.e., P = 2) for the simple N = 1 case (no injection). The damping constants correspond to a singly-resonant oscillator (SRO), with $\alpha_1 = 0.1$, $\alpha_2 = 1.0$; that is, only the $a_1$ mode (to be termed the "signal" henceforth) has a decay constant smaller than the pump. The pump-mode amplitude ($|a_p|$) has been given an initial (T = 0) value of zero and the signal and idler mode amplitudes have been given small initial values to correspond to the noise level.


Fig. 1. OPO mode time dynamics (non-injection case).
Parameters: $\alpha_1 = 0.1$, $\alpha_2 = 1.0$, P = 2

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In  this  figure  three  regimes  are  discernible.  In  the  first  regime,  the  pump-mode 


amplitude ($a_p$) grows from zero to its undepleted value, $|a_p| = P$, and therefore on Fig. 1 it reaches $|a_p| = 2$. Next, noting that on the vertical scale $10^0 = 1$ (i.e., the normalized threshold level), we see that growth of $a_1$ and $a_2$ commences as soon as $a_p$ exceeds unity. Once $|a_p| = P$, the growth rate of $a_1$ (and $a_2$) is just as calculated from the "undepleted-pump" theory, i.e.,


**2  (normalized) 


Finally, the third and final (steady-state) regime is reached as the signal and idler amplitudes become comparable to the pump-mode amplitude and depletion occurs. Note that the final, steady-state, pump amplitude is $|a_p| = 1$, i.e., it settles finally at the threshold level. It can be shown^4 that the final, steady-state solutions are unique for a specified pump source ($\lambda_p$, amplitude and phase), being independent of the phases of the initial values of the (small-amplitude) signal ($a_1$) and idler ($a_2$) modes.
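The three regimes can be reproduced with a short Runge-Kutta integration of a single-mode (N = 1) model. The normalized equations are not legible in this copy, so the form used below is an assumption: real amplitudes in the rotating frame, time in units of the pump decay time, a pump drive P, and a coupling scaled to $(\alpha_1\alpha_2)^{1/2}$ so that the threshold sits at $|a_p| = 1$. The function names, the noise level of $10^{-5}$ and the integration length are likewise illustrative choices.

```python
import math


def opo_rhs(state, pump_drive, alpha1, alpha2):
    """Assumed normalized single-mode OPO equations (real amplitudes)."""
    a_p, a_1, a_2 = state
    g = math.sqrt(alpha1 * alpha2)          # coupling scaled so threshold is a_p = 1
    return (
        pump_drive - a_p - g * a_1 * a_2,   # pump: drive, decay, depletion
        -alpha1 * a_1 + g * a_p * a_2,      # signal: decay, parametric gain
        -alpha2 * a_2 + g * a_p * a_1,      # idler: decay, parametric gain
    )


def rk4_step(state, dt, pump_drive, alpha1, alpha2):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = opo_rhs(state, pump_drive, alpha1, alpha2)
    k2 = opo_rhs(shifted(state, k1, dt / 2), pump_drive, alpha1, alpha2)
    k3 = opo_rhs(shifted(state, k2, dt / 2), pump_drive, alpha1, alpha2)
    k4 = opo_rhs(shifted(state, k3, dt), pump_drive, alpha1, alpha2)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(pump_drive=2.0, alpha1=0.1, alpha2=1.0, dt=0.1, t_end=300.0, noise=1e-5):
    """Integrate from a zero pump amplitude and noise-level signal and idler."""
    state = (0.0, noise, noise)
    history = [state]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, pump_drive, alpha1, alpha2)
        history.append(state)
    return history


if __name__ == "__main__":
    history = simulate()
    peak_pump = max(s[0] for s in history)
    final_pump, final_signal, final_idler = history[-1]
    print(f"peak pump {peak_pump:.3f}, final pump {final_pump:.3f}, "
          f"final signal {final_signal:.3f}, final idler {final_idler:.3f}")
```

In this assumed model the pump first rises to its undepleted value P, the signal and idler then grow exponentially, and depletion finally clamps the pump at the threshold level, mirroring the qualitative sequence described for Fig. 1.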

In order to illustrate injection operation we show on Fig. 2 the results of calculations with one additional mode ($a_3$) arbitrarily added (the accompanying idler modes
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have  not  been  plotted).  In  Fig.  2 the  non-maximunn  gain  mode  (a,)  has  an 
^ 1 

initial  value  of  ^fz~  x 10*  , due  to  assumed  injection,  where  as  the  maximum  gain 


a 2 and  a^ 


-5  u 

mode  (aj)  has  an  initial  value  of  /Z  x 10  to  represent  the  noise  level.  We  see  that 

a quasi-steady-state is established during the period T = 12-27, during which the pump mode amplitude has been depleted to the (normalized) value ($|a_p| = 1/\eta = 1.67$) by the injected mode ($a_3$) which dominates in this regime. Subsequently, the fastest growing mode ($a_1$) attains magnitudes sufficient to further deplete the pump and the final steady-state ($a_p = 1$) is reached essentially by T = 50. The $a_3$ mode has started to decay once the pump amplitude is depleted below its ($|a_p| = 1.67$) threshold value (for T > 27) and continues to decay thereafter to insignificant magnitudes.


The mode dynamics shown on Fig. 2 represent "successful" injection operation, in the sense that an essentially pure* frequency of $\omega_3$ could be expected out of such an oscillator over a (normalized) time period of about 15. Such an output would be shifted from the $\omega_1$ frequency which would be emitted in the absence of injection, or which is emitted in the final steady-state. Some OPO parameters do not lead to successful injection operation, however; an example is shown on Figure 3.


[Figure legend: vertical axis |a|; $\eta_1 = \eta_3 = 0.637$, $\eta_2 = 1.0$]

* We ignore the idler frequencies, since they are in another wavelength band for typical OPOs.
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Here,  an  N = 3 case,  the  Injection  is  on  mode  a^,  which  is  a non  maximum  gain  mode 

(along with $a_1$). The mode labeled $a_3$ is here the maximum gain mode and may be seen to overtake the injected mode before a quasi-steady-state can be established. As a result the pump becomes depleted to $|a_p| = 1$, resulting in the decay of the injected mode and "unsuccessful" injection.^4 Further calculations show that successful injection can be achieved for the oscillator parameters of Fig. 3 if the pump source* is increased. Such a prescription for successful injection was not, however, found effective in any case where injection was attempted on a mode having a relative coupling constant ($\eta$) less than 0.6.

On  Fig.  4 is  shown  a case  of  successful  injection,  where  the  injected  mode  (a__) 

cb 

has  a relative  coupling  constant  1 23“®*  ^3  and  the  (relative)  pump  strength  is  ^ = 3. 

,ali 


z'  ' A 
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Fig.  4. 


OPO  mode  time  dynamics  (successful  injection). 
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The  pump  source  parameter  ( X.  ) corresponds  to  the  field  strength  of  the  (external) 
pumping  laser.  ^ 
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A quasi-steady-state  is  seen  to  be  established  for  over  50  time  units.  This  figure  also 
illustrates  the  (small)  influence  of  a large  number  of  off-peak-gain  modes,  as  would  be 
present in an actual OPO. The curves labelled with two modes (e.g., the curve second from the top on the right) represent modes whose frequencies are disposed symmetrically on either side of the maximum gain mode, and thus these modes have identical gain characteristics. In any case, these modes, if not injected, contribute only to a minor extent to the depletion dynamics and all decay once the peak-gain mode depletes the pump mode.


Throughout these calculations one parameter stands out as a consistent measure of success for injection tuning. That parameter is $\eta = \kappa_{inject}/\kappa_{max}$, the coupling constant of the injected mode relative to that of the highest-gain mode. Several injection criteria can be set down:

a) if $\eta$ < 0.6, then successful injection cannot be expected

b) an increase in pump strength (i.e., P >> 1) will not result in successful injection when $\eta$ < 0.6

c) if $\eta$ > 0.6, an increase in pump strength can improve injection

d) for low pump strength ratios (e.g., P = 1.5), coupling constants considerably higher than 0.6 (e.g., $\eta$ > 0.85) are required for successful injection

e) an increase in cavity Q or cavity-reflectance appears to improve injection, when the operation is already at least in the marginal success range.
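Criteria a) through d) can be collected into a small screening function for a proposed operating point. This is a rough sketch only: the function name and the returned labels are hypothetical, the cut at P = 1.5 treats the quoted example as a hard boundary (an assumption), and criterion e) is not modelled because no numerical threshold is given for it.

```python
def injection_outlook(eta, pump_ratio):
    """Rough screening of an operating point against criteria a) to d).

    eta        : coupling constant of the injected mode relative to the peak-gain mode
    pump_ratio : pump strength relative to threshold (P)
    """
    if eta < 0.6:
        # Criteria a) and b): raising the pump does not help below 0.6.
        return "not expected"
    if pump_ratio <= 1.5 and eta <= 0.85:
        # Criterion d): a weak pump needs a considerably higher coupling constant.
        return "unlikely at this pump strength"
    # Criterion c): above 0.6, raising the pump can improve injection.
    return "possible"


if __name__ == "__main__":
    for eta, pump in [(0.5, 5.0), (0.7, 1.5), (0.9, 1.5), (0.637, 3.0)]:
        print(eta, pump, injection_outlook(eta, pump))
```

A screen of this kind is only a first filter; the time-dependent calculation is still needed to confirm that a quasi-steady-state is actually established.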

The $\eta$ parameter is not a directly measurable quantity for the experimenter. It nonetheless can be estimated from the gain taper once the total bandwidth of the OPO cavity is known, by a method^4 used in this study. The results get translated into the number of modes "off center" which can be used for injection tuning.^4
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PARAMETRIC  EXCITATION  OF  COUPLED  PLASMA  WAVES 
B.  R.  Cheo,  C.  Hechtman,  S.  P.  Kuo,  T.  Q.  Yip 

A.  Introduction 

Parametric decay instabilities in plasmas have been studied extensively. In one recent review article by Porkolab, over one hundred references are cited. For non-magneto plasmas, the theoretical efforts are typically represented by that of Nishikawa,^2 whose basic approach is to use the hydrodynamic equations to derive the coupled mode equations. For magneto plasma, the method generally used is a transformation by Aliev et al.^3 on the Vlasov equation as shown by the work of Porkolab.^4 In this case, the quasi-static approximation $\mathbf{E} = -\nabla\Phi$ has been used, and the results are hence valid only for longitudinal waves. In this report we present a general analysis based on the Hamiltonians of the decay waves. The approach is valid for all wave types. In the limiting case of $B_o \to 0$ the result is reduced to that of Nishikawa.^2 In the limit of longitudinal waves the results of Porkolab^{1,4} are recovered.


B.  Coupled  Mode  Equations 

We  consider  two  normal  modes  designated  by  S and  L (e.g.  , a phonon  and  a 
plasmon)  with  frequencies  ojJ  and  in  the  linear  regime,  wave  numbers  ± k (uniform 
pump  field  assumed),  and  reduced  masses  for  the  two  species  M and  m.  They  can  be 
taken  as  ion  and  electron  masses  if  S is  dominated  by  ions  and  L by  electrons.  The 
unperturbed  Hamiltonian  density  of  each  mode  is  given  by 


• i S ik"  (y/M  + M ^ q"  ®) 

o 

”!)?  = I Tj  (k)) 


o = 1,  2,  3 Cartesian 
components 


(1) 


where  K,  Q,  U are  respectively  the  canonical  momentum  and  displacement  variables. 
Assuming  a static  magnetic  field  B = B^z,  these  variables  are  normalized  to  satisfy  the 
Poisson  bracket  relations: 

{Q^(k),  Kp(k')}=  6(k.  k')6^  p and  [UJik),  p(k')}=  6(k,  k')  6^  ^ 
and  are  related  to  the  modal  fields  as; 


E (k)  = Q(k) ^ K(M  X i 

® e Jli’  e Jn 

^ o ^ o 

(2) 

7 ^ 

E (k)  = (k)  U (k)  + — ^(k)  X 2 

e J~n~  e ^n” 

o ^ o 
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where  and  are  the  actual  (shifted)  oscillating  frequencies  of  the  respective  modes, 
n and  n.  are  cyclotron  frequencies  for  masses  m and  M with  the  appropriate  sign  of 
the  charges,  and  n^  is  the  background  density.  When  the  plasma  is  pumped  by  an  ex- 
ternal field  E = E^  + = 2E^q  sinu;^t  + ZE^q  cosu^t,  coupling  of  the  modes  will  take 

place. The Hamiltonians will be perturbed through the modulation of the electrostrictive polarizability tensor $\alpha_{ij}$. In the first order Taylor's series expansion, we have (Einstein summation convention is understood):


ba. . 

O , 11 

a. . - a. . + , — — 

ij  ij  vdE^ 


so 


ba. . 

^sc  ' SE^  o 
o i a o 


(3) 


Keeping  terms  of  frequencies  and  the  induced  polarization  due  to  the  pump  is 

expressed  as 


P. 

1 


a..  E . 
ij  PJ 


a°  E . + 
ij  PJ 


ba. . . 

V bE  ^ a ^pj 

s o o 


ba. . 
JJL 


f o 


(4) 


Because  of  the  conservation  of  frequencies  ~ second  term  may  be  desig- 

nated by  6Pjji  and  the  third  by  6P  . . The  interaction  Hamiltonians  of  S and  L are  thus 

= -6P(k)E  .(k)  and  h'/^^=  -6p1^^  E.:(k).  with: 
s SI  si  — JE  jti  ri  — 


6P 


(k)  _ 


SI 


ba. . 

- I E"^  (k)  E".  + c.c.  = 6P^.  + c.c. 

+ PJ  SI 


'^E]^{k)  'o 


E‘  E"*".  + c.c.  = bP't.  + c.c. 

^bE;^(k)  ' PJ 


(5) 


The superscripts "±" here and elsewhere designate respectively the exp(±iωt) components of the oscillating fields. The total Hamiltonian of each mode is then the sum of the unperturbed and interaction Hamiltonians, $H = H^o + H'$. It is known that the equation of motion for any canonical variable A is $\ddot{A} + 2\Gamma\dot{A} + \Gamma^2 A = \{\{A, H\}, H\}$, where $\Gamma^{-1}$ is the phenomenological relaxation time. We can apply this to derive the equations of motion for the four variables. Using Eq. (2), we then obtain the coupled mode equations:


+ 

eJ  (y  + 


i;  + oi2(k)E;  = 

o 

2r.  Et  + eJ  = -2^  6P+(k)+ 

III  n e 


uj^(y  nj^(2  x6P‘(y  ^ 

0^(2  X 6P^(k))x£ 


(6) 
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where  w?  = + F?  the  shifted  oscillating  frequencies.  It  is  seen  that  6 P on 

XyS  XyS  X}S 

the  RHS  of  Eq.  (6)  provides  the  coupling  between  S and  L as  shown  in  Equation  (5), 

C.  Coupling  Coefficients 


To obtain the coupling coefficients $\partial\alpha_{ij}/\partial E$, we consider the distribution function $f_a$ of species $a$, which satisfies the Vlasov equation

$$\partial_t f_a + \mathbf{v}\cdot\partial_{\mathbf{r}} f_a + \frac{e_a}{m_a}\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{B}\right)\cdot\partial_{\mathbf{v}} f_a = 0 \qquad (7)$$



Using  a procedure  similar  to  that  of  Aliev^et  al.  , ^ we  first  transform  Eq.  (7)  to  the 
oscillating  frame  of  reference:  v = ^ j dt^R^(t-t^)  • Ep(t  ) and 

r = r / dt'  f dt"R_(t'-t")  . E (t")  where  R(t)  = (3«+ yy)  cos  Si  t + 

- "’a  =a  -p  =a  a 

(3^  -yx)  sin  + zz  (S2  = e^  Bym^c)  and  E^lt)  is  the  pump  field.  Let  = 

and  E = E + 6 E with  f^  ' being  a spatially  uniform  function,  | <<  jf^^  ] and 

16E1  <<  lEpj.  Collecting  zeroth  and  first  order  terms  of  the  transformed  Eq.  (7), 
we  have 


^a-  ^ ^ 


a t)  = 0 

V a ' 


(0) 


and 


(3^  + V • 3^  + 


a ) ^ f,E  ■ 3 f 

v'  m^^  — V a 


(o) 


(9) 


Assuming  that  f 


(o)  _ 


ru. 


3/2  , 2 

a , 2nT  '"ll^  ^ Maxwellian  in  v,  and 

that  the  spatial  dependence  of  and  6 E take  the  form  exp(ik  • £(j)«  we  can  int-.grate 
Eq.  (9)  along  an  unperturbed  trajectory:  r = v,  v = S2^  v x z.  Along  such  trajectories 
v(t')  = R(t'-t)  • v(t)  and  r(t')  = r(t)  + S2‘^  t)  • v(t).  where  L^(t)  = Jq 

Along  this  trajectory  the  LHS  of  Eq.  (9)  becomes  an  exact  differential.  Integrating  the 
RHS  along  the  trajectory  thus  yields  The  total  induced  current,  and  hence  the  po- 
larization, can  be  then  calculated  from  J = E simpUcity,  we  choose 

the  wave  vector  k to  be  in  the  x-z  plane  k = k^  x + k,,  z,  the  integration  can  be  carried 
out  through  the  transformation  of  the  variables  in  the  (r,  v)  frame,  yielding  the  total 
dielectric  tensor.  The  linear  part  can  be  shown  to  agree  with  the  well  established 
form?  The  nonlinear  part  can  be  best  described  by  the  induced  polarization  6^  and 
6?"^  . After  much  algebra^’ ^ we  obtain 
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D.  Comparison  with  Existing  Results 

Because  only  longitudinal  waves  are  considered,  we  can  set  E„  = ik  $ and  E^  = 

“*  S ^ S “ X 

-ik®^  , and  rewrite  Eq.  (6)  as 


$ + 2r  $ +co  ^ 

S S S S £ 


^ ^ 6P'{k))x£  j (12) 


nek 
o 


nek 
o 


We  consider  two  cases  when  comparison  can  be  made;  (I)  = 0 or  the  pump  wave  is 

the  ordinary  mode  with  decay  waves:  Langmuir  and  ion  acoustic  waves;  and  (2)  pump 
wave  is  the  extraordinary  mode,  |n  |»  copg,  with  decay  waves:  upper  hybrid  and 
electrostatic  ion  cyclotron  waves,  (m/M)  ' < k„/k^<<  1;  or  upper  hybrid  and  low  hy- 

brid waves,  k„/k^«  (m/M)^^^:  (1)  Let  E^^lt)  = ZE^q  cos  w^t  z = E^  and  k=  kz.  Sub- 
stitution of  these  in  Eqs.  (10)  to  (13)  yields 

. 2 

» . y . k e cj  , 

$■  -I-  2r  -i  u $'  = i E" 

S SS  SS  ^Jtp 

mu  ^ 


2 -i  .ke  d--_+ 
f i.  I I I m.Zsp 


(14) 


This  is  the  well  known  result  of  Nishikawa.  (2)  In  this  case  we  write  E = 2E 

”f>  o 

(xcos  u^t  - y sin  u^t)  and  k = k^  x -H  k„  z and  substitution  of  these  in  Eqs.  (1)  and  (11) 
yields 

!:  ■ sp;;!.,  •>  = - ^ s 

CO 

O 


k • 6Pt(k,  t)  = ^ ^ 1 -h  X (“  ) k E $■ 

— — i — 4tt  m 2 ..  s i o s 
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where  we  have  neglected  the  contribution  from  ion  terms,  and 
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substituting  Eq.  (15)  in  Eqs.  (12)  and  (13).  We  have 


o . -iMw^(w^-n^)  ^ 2E^  , 
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the  threshold  field  can  be  calculated  from  the  following: 


2E 
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14 

This  is  in  agreement  with  Porkolab,  ’ It  is  interesting  to  point  out  that  if  for 
case  (2)  a linearly  polarized  pump  is  used  with  UH  and  LH  as  decay  waves,  the  thres- 
hold field  is  reduced^’ ^ by  a large  factor  (w.  /w  )^/^! 
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EVOLUTION  OF  TRANSVERSE  INSTABILITY  IN  A HOLLOW  CYLINDRICAL  WEAKLY 
IONIZED  PLASMA  COLUMN 

H.  Kudyan  and  K.  Chung 

The formation of striations in our electrodeless ECRH device was reported previously, and the possibility was pointed out that a plasma instability rather than the mechanism of electrodes might be the cause of the filamentary structures. The experiment was performed on the setup described previously.

Our investigations of the background plasma of various types of plasma columns indicate that the plasma generated in our machine has significant flow fields both in the azimuthal and axial directions, and that hollow cylindrical plasma columns would develop into striations. Invariably, before the column breaks up into striations, strong localized oscillations which propagate azimuthally with mode numbers m = 3 and 4 are observed. The frequency of these oscillations (20 to 100 kHz) varies with the magnitude of the radial electric field but is virtually independent of ion masses. Although the apparent characteristics of these oscillations are similar to edge oscillations (transverse Kelvin-Helmholtz instability) observed in cylindrical alkali plasmas, and rotational velocity plays a major role, our plasma is of hollow cylindrical shape and synchronous effects of surface charge waves are important in destabilizing the hollow plasma column. As the oscillations become stronger, a distorted steady state is reached (i.e., a striated plasma column).

Based on our findings from the background measurements, a small-signal model is developed which accounts for the observed transverse instability. The dispersion relation obtained in this manner is given by Eq. (1), and plotted in Figure 1. This

4n^-2D[m(l--^)  + (^)^*"-  (|)^'"]-  [(1-  (f)^"^)(l-  (^)^"^)- 
c 

- m(l- ■^)(1-  (|)^"^)]  = 0 (1) 

c 

D = u)  / CO  , CO  = e(N  - N.  )/e  B 

is  similar  to  that  for  the  diocotron  instability^  which  is  well-known  in  the  case  of  un- 
neutralized thin  electron  beams  in  an  axial  magnetic  field.  The  roots  of  the  dispersion 
relation  are  independent  of  the  neutral  pressure  because  our  model  neglects  collisional 
viscosity.  Experimentally,  the  unstable  oscillations  persist  up  to  the  micron  range  of 
neutral  pressures.  Nevertheless,  the  measured  values  of  both  frequency  and  growth 
rate  exhibit  a barely  noticeable  decrease  with  increasing  neutral  pressure. 
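
The stability boundary implied by the dispersion relation can be explored numerically. The following is a minimal sketch, assuming that Eq. (1) has the quadratic form 4Ω² - 2Ω·b_m + c_m = 0 of the classical hollow-column diocotron problem, with b the inner radius, c the outer radius and d the wall radius as in Fig. 1. The coefficient names `bm` and `cm` and both function names are introduced here for illustration only and do not appear in the report.

```python
import cmath

def dispersion_roots(m, b, c, d):
    """Return the two roots Omega of 4*Omega**2 - 2*Omega*bm + cm = 0.

    Assumed geometry: inner radius b, outer radius c, wall radius d.
    """
    bm = m * (1 - (b / c) ** 2) + (c / d) ** (2 * m) - (b / d) ** (2 * m)
    cm = (m * (1 - (b / c) ** 2) * (1 - (c / d) ** (2 * m))
          - (1 - (b / c) ** (2 * m)) * (1 - (b / d) ** (2 * m)))
    # Quadratic formula: Omega = (bm +/- sqrt(bm**2 - 4*cm)) / 4
    disc = cmath.sqrt(bm * bm - 4 * cm)
    return (bm + disc) / 4, (bm - disc) / 4

def growth_rate(m, b, c=0.05, d=0.1):
    """Largest imaginary part of Omega (normalized growth rate)."""
    return max(root.imag for root in dispersion_roots(m, b, c, d))

if __name__ == "__main__":
    for m in range(3, 11):
        print(m, round(growth_rate(m, b=0.04), 4))
```

Under these assumptions a column with b = 0.04 m is unstable for m = 3, while a solid column (b = 0) is stable for the same mode, in line with the observation that the hollow geometry is what destabilizes the column.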

The instability has a distinct effect on the configuration of our plasma, manifested
as the formation of striations. An understanding of the phenomenon can be
formulated by treating the hollow plasma column as composed of two individual
systems: the boundaries, which can support unstable surface charge waves, and the bulk of
the plasma. The boundary system is taken to be always uniform in the azimuthal
direction, while the internal layers of charge are assumed to be always uniform in the
radial direction. These two "systems" communicate solely via the electrostatic fields. As
the fields of the two interacting surface charge waves grow, the internal layers of
charge (which are in synchronism with the imposed wave field due to their local velocity)
respond by bunching. The bunching in turn transmits a field which is not in phase
with the imposed fields and makes the two surface waves progressively asynchronous.
Eventually, the two interacting surface waves lose synchronism and become marginally
stable. The bunching itself cannot go on indefinitely because the internal layers "see"
a progressively increasing average Doppler frequency. Eventually the evolution halts and a
distorted steady state is reached.

Fig. 1. The real and positive imaginary parts of the
roots of the dispersion relation for m = 3 to 10
as a function of b (c = 0.05 m, d = 0.1 m).

The model explained above can be represented by a two-dimensional mathematical
model, which a priori assumes the presence of the field generated by the surface charge
waves (a consequence of the radial inhomogeneity). The set of nonlinear equations
scaled to one linear growth time and one perimeter of synchronous layer is:

Here $f(y, \tau)$ is the normalized charge density and $I(\tau)$ is the normalized electric
field intensity of each unstable surface wave.

These equations are followed in time numerically using the IGI (Interactive Graphic
Interpretive) language. The outcome of these computations indicates that the initially
uniform configuration evolves in time, as seen in Figs. 2 and 3, toward a distorted steady
state. In our experimental investigations of the conditions leading to striations, we have
noticed that with very thin hollow columns (i.e., several unstable surface modes) it was
usually hard to induce uniformly distributed striations around the column perimeter;
instead, a single striation or several striations were noticed to exist asymmetrically.
With relatively thicker hollow columns (i.e., one unstable surface mode), the striations
were noticed to occur rather symmetrically around the column perimeter. Our nonlinear
model does exhibit this characteristic behavior of the striations. Figure 2 corresponds to
an initial column geometry where only the m = 3 surface mode is unstable; Fig. 3
corresponds to an initial column geometry where the m = 3 and 4 surface modes are
simultaneously unstable.

Fig. 2. The computational result for the case where m = 3 is unstable.

Fig. 3. The computational result for the case where
m = 3 and m = 4 are unstable.

RESPONSE OF MINOR SPECIES TO GRAVITY WAVES IN THE THERMOSPHERE
S. H. Gross and H. Eun

The response of minor species to gravity waves in the thermosphere varies
according to the mass of the species. The relative density perturbation of any minor
constituent may be related to the relative density perturbation of the atmosphere due
to the waves via a complex function of frequency and wave number that may be
represented as an amplitude and phase response. Peaks and dips in response and large
phase shifts are found that are associated with complex poles and zeroes of the
response function. These poles and zeroes depend on background quantities, so that the
nature of the response is model dependent; collisions are most important. The effects
of background and collisions are examined using numerical computations. The
relationship to some satellite observations is discussed.

A.  Introduction 

It has recently been appreciated that gravity waves may play an important role in the
transport of energy between all layers of the atmosphere, not only for the earth, but
for all the planets that have atmospheres. The role is quite complicated and not yet
fully understood.

One aspect of this problem is the effect of gravity waves on the various
constituents, not least of which is ionization because of its detectability. These
disturbances may influence the refraction of radio waves by the atmosphere and ionosphere
and affect communications as well as radio scientific measurements. The disturbances of
minor species may also influence their chemical reactions, so as to produce special effects
of greater consequence as, for example, in the case of nitrogen oxides in relationship to
ozone. These possibilities deserve further investigation, and they are mentioned here in
order to illustrate the relevance of the topic of this report.

In this report we are concerned with the effects of gravity waves on minor
neutral species in the thermosphere. These individual species have recently become
measurable by satellite. Their response is quite complex because of its dependence on
the mass of the species relative to the mean mass of the atmosphere at a given level
and on the dynamic background motion.

Here the phase and amplitude response relative to the main atmosphere will be
treated for particular wave frequencies and horizontal wave numbers for various minor
species, such as argon, helium and molecular oxygen. Examples will be given for
various models for different times of the day, different solar activity and for a typical
altitude, 250 km. The behavior at 170 km will also be described for contrast. It will
be shown that large phase shifts and large relative amplitude responses are possible at
250 km altitude for specific frequencies and wave numbers for some species. It will
also be shown that the response is highly dependent on the magnitude of the collision
frequency between species. Major emphasis will be on long period (~ 2 hours) waves
of long horizontal wavelength (~ 1000's of km) in order to compare with satellite
measurements.

Data from two satellites are of interest. Reber et al. found from measurements
on board the Atmospheric Explorer satellite, AE-C, that the helium response was
consistently 180° out of phase with nitrogen and argon at altitudes from 160 km to
250 km. The responses of argon and nitrogen were nearly in phase, but the amplitude
response of argon was roughly four times that of helium and twice that of nitrogen.
This response was evident for horizontal wavelengths between 100 km and 400 km,
though much longer wavelengths were also evident in the data. Trinks and Mayr
found evidence of similar responses in the data from the European satellite ESRO 4.
At about 240 km altitude the helium behavior was found to be about 171° out of phase
with nitrogen and argon, which were in phase. The amplitude of the argon response was
found to be about 2.7 times that of helium and about 1.6 times that of nitrogen. The
horizontal wavelength and period of these waves were found to be 3000 km and 150 minutes,
respectively. It is with such waves that we wish to make theoretical comparisons.

B.  Theory 

The basic theory was developed by Gross and Eun for two fluids, one major and
one minor, in local thermodynamic equilibrium. The medium is assumed to be plane
stratified, and the minor species diffuses in the major according to its background
distribution. The perturbation equations are given in matrix form as follows:


Here e is a column matrix of 4 elements: the vertical component of the
perturbation velocity of the main atmosphere, the relative pressure perturbation of the
main atmosphere, $p_1/p_0$, the vertical component of the perturbation peculiar velocity
of the minor species, and the relative pressure perturbation of the minor species.
Subscript 0 refers to background quantities, whereas subscript 1 refers to perturbation
quantities. The perturbation is caused by a gravity wave moving through the atmosphere.
The vertical direction is denoted by the coordinate z. The T matrix is a 4 x 4 matrix
with elements $T_{ij}$, where i = 1 to 4, j = 1 to 4, as shown in Equation (1). Matrix
elements for an isothermal atmosphere in diffusive equilibrium are as follows:


1 1 


Y H 
'n  o 


T,  = — (1-k^ 
12  X 


no 


u 


1 3 


= Tj4  = 0 


iuY  N^  Y -I 

^21=  ^22=;^.  T23  = T24  = 0 

a (j  'no 

no 

1 2 2 

14  ^21  - 
T — ^ 'T*  . X no  rp 

^31"  H ‘ H ^ ^ Y (-iw  + v ) * -^33  " -H 
Here $\gamma$ is the ratio of specific heats, $\omega$ is the radian frequency, $k_x$ is
the horizontal component of the wavenumber, $N$ is the buoyancy or Brunt-Väisälä
frequency, $\nu$ is the collision frequency of a minor species particle with the major
constituent, and $g$ is the acceleration of gravity. The remaining quantities are the
scale heights of the main atmosphere and of the minor species, the velocity of sound in
the major atmospheric constituent, and the background pressure and density of the
minor species particles.

On substituting $\partial/\partial z = i k_z$ in Eq. (1), where $k_z$ is the vertical
component of the wavenumber, one may solve for the relative pressure perturbation of the
minor species in terms of $p_1/p_0$ as a function of $\omega$, $k_x$ and $k_z$. $k_z$ may
then be eliminated on using the dispersion relationship for gravity waves, which relates
$k_z$, $\omega$ and $k_x$. The ratio of the relative pressure perturbation of the minor
species to $p_1/p_0$ is then the relative pressure response of the minor species to the
pressure perturbation of the main atmosphere. This response is a complex function of
$\omega$ and $k_x$, and one may determine a phase and amplitude from this function. Though
the above yields expressions for the relative pressures, this procedure was followed for
the relative densities, which is preferable because density is what is actually measured.
The details of the differences in the matrix elements for these densities as contrasted
with the pressures will not be given here, but will be published elsewhere.
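
The step from a complex response function to an amplitude and phase response can be sketched in a few lines. This is only an illustration: the single pole-zero response `single_pole_zero` below is a hypothetical stand-in, not the response function derived in this report, and is used merely to show how a nearby complex pole produces a peak in amplitude together with a large phase shift.

```python
import cmath
import math

def amplitude_and_phase(response):
    """Split a complex response into (amplitude, phase in degrees)."""
    return abs(response), math.degrees(cmath.phase(response))

def single_pole_zero(omega, zero, pole):
    """Hypothetical response with one complex zero and one complex pole."""
    return (omega - zero) / (omega - pole)

if __name__ == "__main__":
    # Sweep a real frequency past a pole lying just below the real axis.
    for omega in (0.8, 0.9, 1.0, 1.1, 1.2):
        amp, phase = amplitude_and_phase(
            single_pole_zero(omega, zero=0.0, pole=1.0 - 0.1j))
        print(f"omega={omega:.1f}  amplitude={amp:.2f}  phase={phase:.1f} deg")
```

A response of -2, for instance, corresponds to an amplitude of 2 and a phase of 180 degrees, which is the kind of antiphase behavior reported for helium relative to nitrogen and argon.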

One may, nevertheless, illustrate the response under simplifying conditions,
as shown in Table I. For the purpose here it is based on the assumption that the
frequency $\omega \ll N$. There are several solutions depending on the magnitude of the
real part of $k_z$ relative to the inverse scale height, as well as on the
magnitude of the collision frequency (low altitude, high collision rate; high altitude,
low collision rate). It is seen in three out of the four cases in the Table that the
response depends on the mass ratio of the minor to major species particles.
A large phase shift is possible only for lightweight minor particles, in accordance with
satellite experimental findings.


TABLE  I. 
For more complex cases one must resort to numerical analysis. This was done
for a series of background models ranging from low to high solar activity. A typical
case is shown in Fig. 1, which is for mean solar activity at midnight, 250 km altitude,
and a wave of 150 minute period. The plot is against the horizontal wavelength. It shows
phase and amplitude for helium (He), oxygen (O_2) and argon (Ar). It should be noted
that all species have relative amplitudes greater than unity in this range. The large
phase shift of helium is quite evident, whereas there is little phase shift between
oxygen and argon.


Fig.  1.  Density  phase  and  amplitude 
relative  to  main  disturbance. 

As  solar  activity  is  reduced  at  250  km  altitude  for  this  period,  it  is  found  that  the 
argon  and  oxygen  responses  hardly  change,  whereas  the  helium  response  becomes 
peaked  at  specific  wavelengths.  Similar  characteristics  are  found  for  other  periods, 
as  well. 

The results are summarized in Fig. 2, where the control of collision frequency
is evident. This figure is a plot of the peak of the helium response as a function of
collision frequency. The various models refer to CIRA 1965 atmospheric models.
The higher the number of the model, the higher the solar activity. The graph
illustrates the peak response of helium for three wave periods, 130 minutes, 150
minutes and 180 minutes. The corresponding horizontal wavelengths for the various
points are indicated by number in the body of the figure.


Fig. 2. Relative peak response of helium vs. helium-atomic oxygen
collision frequency (sec^-1) for wave periods of 180, 150 and 130 min.


Calculations made for an altitude of 170 km show that the phase and amplitude
responses are nearly model independent. Nevertheless, the phase shifts of helium are
quite large. The collision frequencies at this altitude are about an order of magnitude
greater than those at 250 km, and the absence of sensitivity to solar activity may be
explained on this basis.

In conclusion, calculations have shown that the thermosphere is capable of a
strong response for minor species of lighter mass at the proper range of altitudes.
They also show that the calculated responses are very much like those found from
satellite measurements for the solar activity at the time. Typically, the strong response
is for periods of ~ 2 hours. Helium tends to have a very strong resonant response in the
200 - 300 km altitude range for lower solar activity, suggesting the likelihood of coupling
to its acoustic wave. At higher solar activity the argon response grows to dominate,
though helium still exhibits a large phase shift. The response of argon is mainly
independent of atmospheric models. The relative response of minor species may
possibly be utilized in the future as an indicator of local heating effects because of the
sensitivity, particularly for helium, to collision rates.

NEW RESULTS IN ELECTRICAL MACHINE THEORY: A FLOQUET THEORY OF THE
GENERAL, LINEAR NON-IDEAL ROTATING MACHINE

D.C.  Youla  and  J.  J.  Bongiorno,  Jr. 

A.  Introduction 

Conceptually, an electromagnetic rotating machine is a collection of n windings,
threaded either in air or around ferromagnetic material or both, which are disposed into
two groups, the stator and the rotor. The former contains s windings and is stationary
while the latter contains the remaining n-s = r windings and rotates as a rigid body about
an axis fixed in the laboratory frame. If eddy currents, hysteretic effects and
retardation are negligible it can be assumed that the instantaneous flux linking each
winding at time t is a function of the n winding currents $i_1(t), i_2(t), \ldots, i_n(t)$
and a single angle $\theta(t)$ which serves to define the electro-geometric configuration.
Thus, over a suitable dynamic range the flux

\[ \psi_k = \psi_k(i_1, i_2, \ldots, i_n, \theta) \tag{1} \]

linking winding #k is a single-valued function of the n+1 arguments
$i_1, i_2, \ldots, i_n, \theta$ with period $2\pi$ in $\theta$, $k = 1 \to n$. Ordinarily,
the angle $\theta$ is related to true mechanical degrees $\phi$ via a formula of generic
type

\[ \theta = \frac{p}{2}\,\phi, \tag{2} \]

p the number of poles.

Let the current $i_k(t)$ be supplied by the voltage source $v_k(t)$ and denote the mutual
winding resistance by $r_{kl}$, ($k, l = 1 \to n$). Then, by applying Kirchhoff's voltage
law to each winding we obtain the equations

\[ v_k(t) = \sum_{l=1}^{n} r_{kl}\, i_l(t) + \frac{d\psi_k(t)}{dt}, \tag{3} \]

$k = 1 \to n$; or, more compactly,

\[ v(t) = R\, i(t) + \dot{\psi}[i(t), \theta(t)], \tag{4} \]

where

\[ v = (v_1, v_2, \ldots, v_n)', \quad i = (i_1, i_2, \ldots, i_n)', \quad \psi(\cdot) = (\psi_1, \psi_2, \ldots, \psi_n)' \tag{5} \]

are column-vectors (' denotes matrix transpose) and

\[ R(\theta) = (r_{kl}) \tag{6} \]

is the associated n x n resistance matrix. Omitting the argument t,

\[ v = R(\theta)\, i + \dot{\psi}(i, \theta), \tag{7} \]

the preferred working form of Equation (3). As is well known, in a linear machine flux
and current satisfy the linear relationship

\[ \psi(i, \theta) = L(\theta)\, i, \tag{8} \]

in which $L(\theta)$ is the n x n inductance matrix. In general, $R(\theta)$ and
$L(\theta)$ are real, of period $2\pi$ in $\theta$ and symmetric nonnegative and
positive-definite, respectively; i.e.,

\[ R(\theta) = R(\theta + 2\pi), \quad L(\theta) = L(\theta + 2\pi), \quad R'(\theta) = R(\theta) \ge 0 \quad \text{and} \quad L'(\theta) = L(\theta) > 0 \quad \text{for all } \theta. \]

The substitution $i = L^{-1}\psi$ converts Eq. (7) into the first-order linear vector
differential equation for flux,

\[ \dot{\psi} = -A(\theta)\,\psi + v \tag{9} \]

whose defining coefficient matrix

\[ A(\theta) = R(\theta) L^{-1}(\theta) \tag{10} \]

is evidently of period $2\pi$. However, because $A$ is a function of $\theta$ instead of
$\dot{\theta}$, this description is time-variable even under the condition of constant
angular velocity. Unfortunately, if the machine is not assumed to be ideal in the sense of
Park (Appendix), this lack of stationarity cannot be transformed away by any obvious change
of reference frame. Nevertheless, such a frame, the Floquet frame, does exist.

B.  The  Floquet  Frame 

Let us imagine that we attach to the rotor n open-circuited exploratory coils whose
various coefficients of coupling are permitted to depend explicitly on both $\theta(t)$
and $\dot{\theta}(t)$. Then, it is reasonable to assume that the flux vector $\psi_N(t)$
sampled by these coils has the form

\[ \psi_N(t) = T(\theta(t), \dot{\theta}(t))\, \psi(t) \tag{11} \]

where the n x n matrix $T(\theta, \mu)$ possesses the following properties:

P1. For every fixed $\mu > 0$, $T(\theta, \mu)$ is real, nonsingular and of period $2\pi$
in $\theta$.

P2. $T(\theta, \mu)$ has continuous partial derivatives with respect to $\theta$ and $\mu$.

Any matrix $T(\theta, \mu)$ satisfying P1 and P2 is said to be admissible and defines an
admissible reference frame.
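Using Eq. (11) to eliminate $\psi$ in Eq. (9) we easily obtain the differential equation
for $\psi_N$:

\[ \dot{\psi}_N = (\dot{\theta}\, T_\theta T^{-1} - T A T^{-1} + \ddot{\theta}\, T_\mu T^{-1})\, \psi_N + v_N. \tag{12} \]

In Eq. (12),

\[ v_N = v_N(t) = T(\theta(t), \dot{\theta}(t))\, v(t), \tag{13} \]

\[ T_\theta = \frac{\partial T(\theta(t), \dot{\theta}(t))}{\partial \theta} \tag{14} \]

and

\[ T_\mu = \frac{\partial T(\theta(t), \dot{\theta}(t))}{\partial \mu}. \tag{15} \]

($\psi_N(t)$, $v_N(t)$, etc., are "normalized" vectors.)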


According to a result of Erugin, for all sufficiently large $\mu$ there exists an
admissible matrix $T(\theta, \mu)$ such that

\[ T(\theta,\mu) A(\theta) T^{-1}(\theta,\mu) - \mu\, T_\theta(\theta,\mu) T^{-1}(\theta,\mu) = \mu\, C(\mu) \tag{16} \]

where $C(\mu)$ is real and independent of $\theta$. In fact $T$ and $C$ admit Laurent
expansions in $1/\mu$ valid in some neighborhood of $\mu = \infty$. More explicitly,

\[ T(\theta, \mu) = \sum_{k=0}^{\infty} T_k(\theta)\, \mu^{-k} \tag{17} \]

and

\[ C(\mu) = \sum_{k=0}^{\infty} C_k\, \mu^{-k} \tag{18} \]

where (1 is the identity and 0 is the zero matrix),

F1) All $T_k(\theta)$ are real, n x n, independent of $\mu$ and of period $2\pi$ in
$\theta$, $k = 0 \to \infty$.

F2) $T_0(0) = 1$ and $T_k(0) = 0$, $k = 1 \to \infty$.

F3) All $C_k$ are real, n x n and constant, $k = 0 \to \infty$.

Under such a choice for $T(\theta, \mu)$, Eq. (12) collapses into the simpler form

\[ \dot{\psi}_N = -(\dot{\theta}\, C(\dot{\theta}) - \ddot{\theta}\, T_\mu T^{-1})\, \psi_N + v_N. \tag{19} \]

In particular, for constant rotor velocity, $\ddot{\theta}(t) \equiv 0$ and Eq. (19)
reduces to

\[ \dot{\psi}_N = -\dot{\theta}\, C(\dot{\theta})\, \psi_N + v_N, \tag{20} \]

a constant-coefficient differential equation. We have therefore achieved our desired
objective and established a Park-like circuit description for the general linear rotating
machine which is valid in some Floquet frame characterized by the pair of Floquet
matrix parameters $T(\theta, \mu)$, $C(\mu)$.


There exists a class of machines, known as ideal, for which $T(\theta, \mu)$ can be chosen
independent of $\mu$. For this class, which includes the Park machine (Appendix),

\[ A(\theta) = e^{C_0 \theta}\, C_1\, e^{-C_0 \theta}, \tag{21} \]

\[ T(\theta, \mu) = T_0(\theta) = e^{-C_0 \theta} \tag{22} \]

and

\[ C(\mu) = C_0 + \frac{C_1}{\mu} \tag{23} \]

where $C_0$ and $C_1$ are two real n x n matrices. Clearly, for such a machine, the
$\ddot{\theta}$ term in Eq. (19) vanishes rigorously and the normalized differential
equation reduces to

\[ \dot{\psi}_N = -(\dot{\theta}\, C_0 + C_1)\, \psi_N + v_N. \tag{24} \]

The question of how large $\mu$ should actually be to assure the convergence of the
expansions in Eqs. (17) and (18) is a difficult one to answer quantitatively. Nevertheless,
since $\mu$ is always identified with $\dot{\theta}(t) = \omega = 2\pi f = 377$
radians/sec for 60 Hz machines, $\mu \gg 1$ and we conjecture that Eqs. (17) and (18)
converge in most practical cases.

Accepting the truth of the above hypothesis, it is easily seen that under the same
conditions and in any suitable norm,

\[ \| \ddot{\theta}\, T_\mu(\theta, \dot{\theta})\, T^{-1}(\theta, \dot{\theta}) \| \sim \frac{|\ddot{\theta}|}{(\dot{\theta})^2} \tag{25} \]

is a compatible estimate. Consequently, because of the minuteness of the ratio
$|\dot{\omega}|/\omega^2$ it appears that the acceleration term in Eq. (19) should also be
negligible in practical machines. Thus, we have supplied a plausible physico-mathematical
basis for the following extrapolation.

The Practical-Machine Theorem. For practical synchronous machines there
exists a suitable Floquet frame in which the circuit description of the machine assumes
the extremely accurate acceleration-independent form

\[ \dot{\psi}_N = -\dot{\theta}\, C(\dot{\theta})\, \psi_N + v_N. \tag{26} \]

This is a linear differential equation whose coefficient matrix depends solely on
$\dot{\theta}(t)$.

Of course, $A(\theta)$ must have period $2\pi$.

Incidentally, for the Park machine $C_0$ is skew-symmetric and $T_0(\theta)$ is orthogonal
(Appendix).

C. An Experimental Plus Theoretical Overview

According to Eqs. (17) and (18),

\[ \mu\, C(\mu) = \mu\, C_0 + C_1 + \text{terms of order } (1/\mu) \tag{27} \]

and

\[ T(\theta, \mu) = T_0(\theta) + \frac{T_1(\theta)}{\mu} + \text{terms of order } (1/\mu^2). \tag{28} \]

Hence, for $\dot{\theta}(t) \to \infty$,

\[ \dot{\theta}\, C(\dot{\theta}) \to \dot{\theta}\, C_0 + C_1, \tag{29} \]

\[ T(\theta, \dot{\theta}) \to T_0(\theta). \tag{30} \]

In other words, as $\dot{\theta} = \omega \to \infty$ every machine behaves like an ideal
machine whose $A(\theta)$ matrix is given by

\[ A_{hf}(\theta) = e^{C_0 \theta}\, C_1\, e^{-C_0 \theta}. \tag{31} \]

This immediately suggests that there should exist experimental procedures for the
measurement of $C_0$ and $C_1$.

Equally important, however, and quite consistent with the Practical-Machine
theorem, is the real possibility that the approximations

\[ \dot{\theta}\, C(\dot{\theta}) \approx \dot{\theta}\, C_0 + C_1 \tag{32} \]

and

\[ T(\theta, \dot{\theta}) \approx T_0(\theta) + \frac{T_1(\theta)}{\dot{\theta}} \tag{33} \]

are sufficiently accurate over the actual ranges of working frequencies of most practical
machines.

Naturally, a substantiation of this conjecture must involve an in-depth study of
the properties of the $C_k$'s and $T_k$'s and their relationship to the given machine
data, $A(\theta) = R(\theta) L^{-1}(\theta)$. To give an idea of what is involved we shall
derive the necessary recursions and offer a brief analysis of the attendant difficulties.

Consider the periodic differential equation


* 


Currently  under  investigation. 
\[ \frac{dX}{d\theta} = \frac{1}{\mu}\, X(\theta) A(\theta) \tag{34} \]

and let us seek to construct a fundamental matrix solution of the form

\[ X(\theta, \mu) = e^{C(\mu)\theta}\, T(\theta, \mu) \tag{35} \]

wherein $T(\theta, \mu)$ and $C(\mu)$ possess the expansions Eqs. (17) and (18),
respectively, and the enumerated properties F1)-F3). The substitution of Eq. (35) into
Eq. (34) yields immediately*

\[ T_\theta = -C T + \frac{1}{\mu}\, T A \tag{36} \]

and upon equating coefficients of like powers of $1/\mu$ on both sides we obtain the
coupled system of equations

\[ \frac{dT_0}{d\theta} = -C_0 T_0, \tag{37} \]

\[ \frac{dT_r}{d\theta} = -\sum_{k=0}^{r} C_k T_{r-k} + T_{r-1} A, \quad r \ge 1. \tag{38} \]

The solution of Eq. (37) subject to the initial condition $T_0(0) = 1$ is obviously

\[ T_0(\theta) = e^{-C_0 \theta}. \tag{39} \]

However, the solutions of Eq. (38) subject to the initial conditions $T_r(0) = 0$,
$r = 1 \to \infty$, are best expressed in normalized form. Thus, for $r = 0 \to \infty$,
let

\[ Z_r(\theta) = T_0^{-1}(\theta)\, T_r(\theta) = e^{C_0 \theta}\, T_r(\theta) \tag{40} \]

and

\[ M_r(\theta) = T_0^{-1}(\theta)\, C_r\, T_0(\theta). \tag{41} \]

Then, omitting the straightforward details, we find that

\[ Z_1(\theta) = \int_0^{\theta} (A(x) - M_1(x))\, dx, \tag{42} \]

and for $r > 1$,

\[ Z_r(\theta) = \int_0^{\theta} \Big( Z_{r-1}(x) A(x) - M_1(x) Z_{r-1}(x) - \sum_{k \ne 0,1,r} M_k(x) Z_{r-k}(x) - M_r(x) \Big)\, dx. \tag{43} \]

* Equation (16) is a rearranged version of Equation (36).
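
The algorithm is initialized by first selecting the real matrix $C_0$ so that
$T_0(\theta)$ has period $2\pi$. Then, since the integral of a periodic function is
periodic iff the function has zero average value, it follows from Eq. (42) that
$T_1(\theta)$, and therefore $Z_1(\theta)$, has period $2\pi$ iff

\[ \int_0^{2\pi} A(\theta)\, d\theta = \int_0^{2\pi} e^{C_0 \theta}\, C_1\, e^{-C_0 \theta}\, d\theta. \tag{44} \]

This average-value constraint determines an acceptable $C_1$ (if one exists). In general,
having determined $C_0, C_1, \ldots, C_{r-1}$, we obtain $C_r$ by applying the same
average-value argument to Eq. (43):

\[ \int_0^{2\pi} \Big( Z_{r-1}(\theta) A(\theta) - M_1(\theta) Z_{r-1}(\theta) - \sum_{k \ne 0,1,r} M_k(\theta) Z_{r-k}(\theta) \Big)\, d\theta = \int_0^{2\pi} e^{C_0 \theta}\, C_r\, e^{-C_0 \theta}\, d\theta, \quad r > 1. \tag{45} \]

The "best" physical choice for $C_0$ is not at all apparent, although the simplest
mathematical choice that always works is $C_0 = 0$. Under this initialization,
$T_0(\theta) = 1$,

\[ \frac{1}{2\pi} \int_0^{2\pi} A(x)\, dx = C_1; \qquad T_1(\theta) = \int_0^{\theta} A(x)\, dx - \theta C_1 \tag{46} \]

and

\[ \frac{1}{2\pi} \int_0^{2\pi} \Big( T_{r-1}(\theta) A(\theta) - C_1 T_{r-1}(\theta) - \sum_{k \ne 0,1,r} C_k T_{r-k}(\theta) \Big)\, d\theta = C_r, \tag{47} \]

\[ T_r(\theta) = \int_0^{\theta} \Big( T_{r-1}(x) A(x) - C_1 T_{r-1}(x) - \sum_{k \ne 0,1,r} C_k T_{r-k}(x) \Big)\, dx - \theta C_r, \quad r > 1. \tag{48} \]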
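
The first step of the recursion with the initialization C_0 = 0 can be tried numerically. The following is a minimal sketch, assuming a hypothetical 2 x 2 periodic matrix `A` chosen here purely for illustration (it is not machine data from the report): C_1 is taken as the average of A over one period, and T_1(θ) as the running integral of A minus θ C_1, using a plain trapezoidal rule.

```python
import math

def A(theta):
    """Hypothetical 2*pi-periodic coefficient matrix, for illustration only."""
    return [[1.0 + 0.5 * math.cos(2 * theta), 0.2 * math.sin(theta)],
            [0.2 * math.sin(theta), 2.0]]

def integrate(f, upper, steps=2000):
    """Trapezoidal integral of a 2 x 2 matrix function f from 0 to upper."""
    h = upper / steps
    total = [[0.0, 0.0], [0.0, 0.0]]
    for n in range(steps + 1):
        weight = 0.5 if n in (0, steps) else 1.0
        value = f(n * h)
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * h * value[i][j]
    return total

def first_order_terms():
    """Return C1 (period average of A) and the function T1(theta)."""
    full = integrate(A, 2 * math.pi)
    c1 = [[full[i][j] / (2 * math.pi) for j in range(2)] for i in range(2)]

    def t1(theta):
        part = integrate(A, theta)
        return [[part[i][j] - theta * c1[i][j] for j in range(2)]
                for i in range(2)]

    return c1, t1

if __name__ == "__main__":
    c1, t1 = first_order_terms()
    print("C1 =", c1)
    print("T1(2*pi) =", t1(2 * math.pi))   # vanishes: T1 is periodic
```

Subtracting θ C_1 removes exactly the average value of A, so T_1 returns to zero after one period; this is the average-value constraint at work for r = 1.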


To illustrate the inappropriateness of the initialization $C_0 = 0$, consider the ideal
machine with A matrix $A(\theta) = e^{C_0 \theta}\, C_1\, e^{-C_0 \theta}$, Equation (21).
Using this $C_0$ ($\ne 0$) and $C_1$ we obtain the exact one- and two-term expressions

\[ T(\theta, \mu) = T_0(\theta) = e^{-C_0 \theta} \tag{49} \]

and

\[ C(\mu) = C_0 + \frac{C_1}{\mu}. \tag{50} \]

But unfortunately, the initial choice $C_0 = 0$ leads, via Eqs. (46)-(48), to two
infinite-series expansions in $1/\mu$ for both $T(\theta, \mu)$ and $C(\mu)$.

Our studies up to this point seem to suggest that an "optimal" choice of $C_0$ should
probably be predicated on the minimization of some error norm

\[ \| A(\theta) - A_{hf}(\theta) \| \tag{51} \]

measuring the "distance" from $A(\theta)$ to its very-high-frequency limit,
$A_{hf}(\theta)$. We are confident of the correctness of this approach and any new
developments will be documented in future progress reports.

D. Appendix: The Ideal Park Machine

Park's ideal 3-phase machine occupies a central place in the conventional theory
and continues to play a key role in studies involving interconnected power systems.
For the reader's benefit and for future reference we shall give explicit formulas for all
the pertinent design data and all the Floquet parameters without neglecting either
damper-bar inductance or damper-bar resistance.

Pk1) The three identical stator phases are denoted by "a," "b," "c" and are
assumed to possess the same constant ohmic resistance $r_s > 0$. The r x r, real, constant
symmetric nonnegative-definite matrix $R_r$ incorporates all field-winding loss and all
damper-bar resistance. Thus,*

\[ R(\theta) = R = r_s 1_3 \dotplus R_r = R' \ge 0. \tag{52} \]

Pk2) The spatial separation between contiguous phases is equivalent to 120
electrical degrees. Let $\theta$ denote the electrical angle relative to phase a. Then,

\[ L(\theta) = \begin{bmatrix} L_{ss}(\theta) & L_{sr}(\theta) \\ L_{sr}'(\theta) & L_{rr} \end{bmatrix} = L'(\theta) > 0 \tag{53} \]

where

\[ L_{ss}(\theta) = \begin{bmatrix} L_{aa}(\theta) & L_{bc}(\theta + 120°) & L_{bc}(\theta - 120°) \\ L_{bc}(\theta + 120°) & L_{aa}(\theta - 120°) & L_{bc}(\theta) \\ L_{bc}(\theta - 120°) & L_{bc}(\theta) & L_{aa}(\theta + 120°) \end{bmatrix}. \tag{54} \]

* $A \dotplus B$ is the "direct sum" of matrices A and B.
sr 
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L^^(0-12O®)  L^j^(e-120°)  L^^(e-i20°)| 

Laj{e+120°)  Laj^(e+120°)  Laj^(e  + 1 20°)  I 
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(5J 


L 


^ff  ^fk 

- 

^fk  ^kk 

0 

0 

Hih 

- 

. 

= L' 


rr 


(56 


The  matrix  L^^  is  constant  and  symmetric  nonnegative- definite  L is  th.  ^ 

ance  of  the  field  r r ^ ® L is  the  self-mduct 


ance  of  the  field  and  L^,  are  square  matrices  whose  order's  ar'e  detJrrin^rrtl 
Laf  = cos  0 , Laj^  = cos  0 . Laj^  = sin  0 

and 


(57) 


^aa  ■ ^s  ■*■  ^0  °°®  20  , Lj^^  = "Mg  + cos  20 


where  M,  M,  .re  1-rowed  metricee  „irh  .o„-„eg.«e,  elements  and 
Mf  > 0;  L^  > Mq  > 0;  > Mq  > 0 


(58) 


(59) 


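Pk3) Let

\[ M_p(\theta) = \sqrt{\tfrac{2}{3}} \begin{bmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\ \cos\theta & \cos(\theta - 120°) & \cos(\theta + 120°) \\ -\sin\theta & -\sin(\theta - 120°) & -\sin(\theta + 120°) \end{bmatrix}, \tag{60} \]

\[ \Omega = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix} = -\Omega' \tag{61} \]

and

\[ T_p(\theta) = M_p(\theta) \dotplus 1_r. \]

Then,

\[ 1_{3+r} = T_p'(\theta)\, T_p(\theta), \tag{62} \]

\[ C_0 = T_p'(0)\, (\Omega \dotplus 0_r)\, T_p(0). \tag{63} \]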

1641 




k ■ * ■ ' 


- ’’k”!;)''  + \i\\h  - 


Then, 


“ I, 


<^f.k^d>’  ^k  * 


Rr  = tR^  I Rr  ] 


in  which  R and  ^ ' have  the  same  number  of  columns  and  similarly  for  R and  t ' . 

ri  _ k r h 
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DESIGN  OF  INDUCTOR  SYNCHRONOUS  MOTORS 
E.  Levi,  L.  Birenbaum  and  Z.  Zabar 

Recently,  new  interest  in  the  design  of  electrical  machinery  has  been  spurred  by 
the  development  of  solid  state  inverters,  and  by  studies  of  vehicle  propulsion  using 
linear  electric  motors.^  In  this  report,  a new  approach  to  the  design  of  a special  type 
of  synchronous  machine,  the  homopolar  inductor  motor,  is  described.  This  design 
method is especially suited for use with current-source-fed linear machines.

A.  Construction  and  Control  Aspects 

Conventional synchronous machines are built with the field winding, excited by
DC, located on the rotating part (rotor) while the AC armature winding is placed on the
stationary part (stator). The inductor synchronous machine, on the other hand, has
both the field and armature windings on the stator. Since the rotor has no winding, it
can be made as a relatively sturdy and simple structure. Linear versions of this machine
have been proposed for train propulsion. In this case, all of the windings may be
placed on the undercarriage of the vehicle, and the track structure used as the counterpart
of the rotor on the rotating machine. A sketch of an idealized linear inductor motor
is shown in Figure 1.


Fig. 1. Sketch of an idealized linear inductor motor.


The mechanism by which thrust is produced is the following: 3-phase currents
are fed into the armature winding, and create a moving magnetic field. The frequency
is so chosen that the linear speed of the field is precisely equal to the speed of motion,
but oppositely directed. Hence the armature-induced field is at rest with respect to the
track structure, which has poles induced by the field winding on the vehicle above it.
Only when the armature and field-produced magnetic configurations are at rest with
respect to each other is thrust possible in a synchronous machine.

Speed control is achieved by varying the frequency of the 3-phase currents, supplied
by a current-source inverter. Stable operation in this mode is obtained by use of
a control system (Fig. 2) that accurately relates the inverter frequency to the vehicle
speed, and in addition controls the phase angle between the track pole position and the
armature current-induced magnetic field.²,³

B. General Design Aspects

A list of symbols used in this and the following sections is given below.

A            armature surface area
B'           rms fundamental normalized terminal voltage
B_a          rms fundamental armature-excited flux density, (μ0 τ / g π) K_a
B_f          rms fundamental component of flux density in air gap due to field winding
B_g          rms fundamental component of flux density in air gap
B_pf         flux density at the pole face
B_pf,max     maximum allowable flux density at the pole face
B_t,max      maximum allowable flux density in armature teeth
C_d          X_ad / X_m, where X_ad is the direct-axis armature reactance
C_q          X_aq / X_m, where X_aq is the quadrature-axis armature reactance
...          ratio of maximum value of fundamental component of field-induced B
             to value under pole
d            distance from center of pole face
g            effective air gap length
I_f          DC current in the field winding
K_a          rms armature surface current density
l_p          pole face width
N_f          turns/pole in the field winding
P/A          power per unit area of armature surface
t            thrust per unit area of armature surface
v            velocity
w_t          tooth width
X_l          leakage reactance of armature
X_m          magnetizing reactance
2α           π (l_p / τ), the angular span of the pole face
μ0           permeability of free space, 4π × 10^-7
τ            pole pitch
τ_t          slot pitch
φ            angle between armature current and B_g
ψ            angle between direct axis and armature current maximum


Usually,  an  electrical  machine  is  designed  by  a trial  and  error  process.  First, 
the  geometry,  dimensions  and  the  magnetic  and  electric  loading  are  chosen.  From 
these  are  derived  the  terminal  characteristics.  These  steps  are  then  repeated  until 
the  design  specifications  are  met.  Here,  instead,  the  starting  point  is  the  set  of  design 
specifications;  these  are  then  used  to  determine  the  magnetic  and  electric  stresses; 
then the geometry, and finally the dimensions.

Fig. 2. Block diagram of a control system for the LSM.

In the inductor machine, an average component of B is present in addition to the
useful alternating component. The flux density alternates around this average value,
instead of between positive and negative peaks, as in a conventional synchronous machine.
Hence the machine is prone to saturation of the magnetic flux path. The armature
teeth are especially heavily stressed. It is important to avoid saturation of the
salient pole structure, since the machine relies on variable reluctance effects for its
operation.

The armature tooth saturation problem may be partially overcome by using a
positive value for the angle ψ (see Fig. 3), between the direct axis and the maximum
of the armature surface current density K_a. This causes the fundamental components
of the armature and field-induced magnetic flux densities (B_a and B_f), or more precisely
their mmf's, to subtract along the pole face, thus reducing tooth saturation in the section
of the armature directly above the pole.

If  the  design  is  based  on  unity  power  factor,  motor  size  is  minimized  for  a given 
output  power.  This  is  done  here. 

C.  Specific  Design  Aspects 

As the starting point, we suppose that the maximum allowable flux density in the
armature teeth ($B_{t,max}$) is specified. This, in turn, determines the maximum flux
density at the pole face ($B_{pf,max}$):

$$B_{pf,max} = \frac{w_t}{\tau_t}\, B_{t,max} \qquad (1)$$

Here, $\tau_t$ is the slot pitch and $w_t$ the tooth width.
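As an illustration of Eq. (1), the following minimal sketch evaluates the pole-face limit from a tooth limit. The function name and the numerical values are hypothetical and chosen only for illustration; they are not taken from the report.

```python
def pole_face_flux_limit(b_tooth_max, tooth_width, slot_pitch):
    """Eq. (1): the flux crossing one slot pitch at the pole face is funneled into one tooth."""
    if not 0 < tooth_width <= slot_pitch:
        raise ValueError("tooth width must be positive and no larger than the slot pitch")
    return b_tooth_max * tooth_width / slot_pitch


# Illustrative numbers only: a 1.8 T tooth limit and teeth half a slot pitch wide.
b_pf_max = pole_face_flux_limit(1.8, tooth_width=0.010, slot_pitch=0.020)
print(f"B_pf,max = {b_pf_max:.2f} T")
```

With teeth occupying half of each slot pitch, the allowable pole-face density is half the tooth limit, which is why the teeth are the first part of the magnetic path to saturate.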


Fig. 3. Relation between fundamental components of armature
and field produced magnetic fields for ψ > 0.

The flux density at any point above the pole face may be calculated using
Ampere's law, and is found to be

$$B_{pf} = \ldots \qquad (2)$$

where the first term is related to the ampere-turns/pole, $N_f I_f$; the second is related
to the RMS armature surface current density $K_a$. ($\tau$ is the pole pitch, or half the
distance between successive N poles; g is the effective air gap, i.e., the effective
separation between the pole surface and the under surface of the armature core.) At the
leading pole edge, where $B_{pf}$ is greatest, d becomes equal to α, where $2\alpha = \pi (l_p/\tau)$ is
the angular span of the pole face.


From Eq. (2), it is seen that a choice of ψ > α avoids the situation where the
armature and field mmf's are additive. Also α should be reasonably small, since this
decreases the permeance of the interpolar space, and hence reduces the transverse
flux. As a result, the weights of the field yoke and of the pole structure are both
reduced. An optimum value of α is about 50°.⁴,⁵


The Blondel 2-reactance theory can now be used to describe the machine behavior.
A phasor diagram, normalized to show the (RMS) fundamental components of the
magnetic field quantities, is shown in Figure 4.


Fig.  4.  Phasor  diagram  for  salient-pole  synchronous 
motor  for  unity  power-factor. 


From this phasor diagram the following identity can be derived:

$$OA = \left(\frac{X_l}{X_m} + C_q\right) \frac{\mu_0 \tau}{\pi g \sin\psi}\, K_a = B_f - (C_d - C_q)\, \frac{\mu_0 \tau}{\pi g}\, K_a \sin\psi \qquad (3)$$

and solving for $K_a$,

$$K_a = \frac{B_f}{\dfrac{\mu_0 \tau}{\pi g} \left[ (C_d - C_q) \sin\psi + \left( C_q + \dfrac{X_l}{X_m} \right) \dfrac{1}{\sin\psi} \right]} \qquad (4)$$

A relation is thus established between $K_a$ and $B_f$, which are indices of the electric and
magnetic loading, respectively.


If A is the area of the air gap, v the velocity of relative motion between the armature
and the polar structure, the electromagnetic force developed per unit area of the
air gap is:

$$t = \frac{P}{A v} = K_a \cdot B_g \cos\phi = K_a \cdot OA \cos\psi \qquad (5)$$


The terms $C_d$ and $C_q$ are proportional to the direct and quadrature saturated
reactances, and their difference is present in Equation (4). The approximation is now
made that the first term in the denominator, $(C_d - C_q) \sin\psi$, is small in comparison
with the second. This permits the following simplified expression for the thrust per
unit area to be obtained when Eqs. (3) and (4) are substituted into Eq. (5):

$$t \approx \frac{\pi g}{\mu_0 \tau}\, \frac{B_f^2}{C_q + X_l / X_m}\, \frac{\sin 2\psi}{2} \qquad (6)$$


The force density is maximum for ψ = 45°, a value which is not too different from the
optimal value of α.

If ψ is set to a value equal to the angle α, no additive mmf will be present, even
at the leading pole edge (Equation (2)). When α is close to its optimum value, this
choice of ψ also maximizes the thrust per unit area, and hence, for a specified power
output, tends to lead to a machine of minimum size.
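The trade-off between ψ = 45° and ψ = α can be checked numerically. The sketch below evaluates only the angular factor sin 2ψ / 2 of Eq. (6); the remaining factors do not depend on ψ under the stated approximation. Function names are hypothetical.

```python
import math


def thrust_shape(psi_deg):
    """Angular factor sin(2*psi)/2 of the simplified thrust density, Eq. (6)."""
    return 0.5 * math.sin(2.0 * math.radians(psi_deg))


def best_psi(step_deg=1):
    """Grid search over 0..90 degrees for the angle that maximizes the factor."""
    angles = range(0, 91, step_deg)
    return max(angles, key=thrust_shape)


psi_opt = best_psi()
# Fraction of the peak thrust density kept when psi is set to alpha = 50 degrees.
retained = thrust_shape(50) / thrust_shape(psi_opt)
print(psi_opt, round(retained, 3))
```

Setting ψ = 50° instead of 45° costs only about 1.5 percent of the peak thrust density, which supports the remark that the two requirements are nearly compatible.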


D.  Conclusion 


It has been shown that one may approach the design of an inductor motor by specifying
an upper limit for the flux density in the armature teeth, and then proceeding,
almost directly, to obtain a thrust per unit area, and hence a machine size.

Department of Transportation
DOT-FR-30030 and DOT-FR-64227          E. Levi, L. Birenbaum and Z. Zabar
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MAXIMUM  A POSTERIORI  ESTIMATOR  FOR  SUPPRESSION  OF  INTERCHANNEL 
INTERFERENCE  IN  FM  RECEIVERS 

F.  A.  Cassara  and  H.  Schachter 

A.  Introduction 

In a previous work¹ experimental evaluation of a novel FM detector capable of
suppressing interchannel interference in FM receivers was presented. In this report
it is theoretically demonstrated that the receiver structure experimentally studied
represents the maximum a posteriori (MAP) estimator for the phase of the desired received
FM signal when this signal is simultaneously corrupted by a co-channel interferer and
additive white Gaussian noise.

B. Optimum Receiver

Consider a received FM signal with interchannel interference and additive
Gaussian noise, which may be represented mathematically as

$$v(t) = s_1(t) + s_2(t) + n(t) , \quad 0 < t < T \qquad (1)$$

where

$$s_1(t) = A_1 \cos\left[ \omega_1 t + \int^{t} m_1(u)\, du \right]$$

$$s_2(t) = A_2 \cos\left[ \omega_2 t + \int^{t} m_2(u)\, du \right]$$

n(t) is white Gaussian noise with zero mean and normalized two-sided power
spectral density of 1 watt/Hz,

$m_1(t)$ is the desired signal modulation,

$m_2(t)$ is the modulation on the interfering FM wave, and

$A_1$ and $A_2$ are assumed constant.

$m_1(t)$ and $m_2(t)$ are assumed to be independent stationary Gaussian processes with zero
mean and autocorrelations $R_{m_1}(\tau)$ and $R_{m_2}(\tau)$. We designate

$$x_1(t) = \int^{t} m_1(u)\, du \qquad (2)$$

$$x_2(t) = \int^{t} m_2(u)\, du \qquad (3)$$

This converts the FM waves $s_1(t)$ and $s_2(t)$ into angle modulated waves with $x_1(t)$ and
$x_2(t)$ as modulating signals. It is seen that $x_1(t)$ and $x_2(t)$ are also mutually independent
zero mean Gaussian random processes. Let the autocorrelation functions of $x_1(t)$
and $x_2(t)$ be denoted by $R_{x_1}$ and $R_{x_2}$.

We now derive a MAP receiver for the optimum reception of $m_1(t)$. Van Trees
has proved that a linear operation on the MAP estimate of a continuous random process
is equal to the MAP estimate of the output of the linear operation performed on the
random process. Therefore MAP estimation of $x_1(t)$ and subsequent differentiation of
the estimate $\hat{x}_1(t)$ will give the MAP estimate $\hat{m}_1(t)$ of $m_1(t)$. We follow this procedure.
We use abstract vector space methods throughout as developed by Schwartz. In the
interest of brevity, we do not indicate the time dependence of variables except where
explicitly necessary.

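The controlling relation for MAP estimation of $x_1(t)$ is³

$$\frac{\partial p(x_1 | v)}{\partial x_1} = 0 \qquad (4)$$

where $p(x_1 | v)$ is the conditional probability density function of $x_1$ given the received
signal v(t). Equation (4) can be written in the following equivalent form

$$\frac{\partial [\log p(x_1 | v)]}{\partial x_1} = 0 \qquad (5)$$

Using the relation

$$p(x_1 | v) = p(v | x_1) \cdot p(x_1) / p(v)$$

and

$$\frac{\partial p(v)}{\partial x_1} = 0$$

we obtain

$$\frac{\partial \log p(v | x_1)}{\partial x_1} + \frac{\partial \log p(x_1)}{\partial x_1} = 0 \qquad (6)$$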


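The first part of Eq. (6) takes the form

$$\frac{\partial}{\partial x_1} \log p(v | x_1) = \frac{1}{p(v | x_1)}\, \frac{\partial}{\partial x_1} \int_{-\infty}^{\infty} p(x_2)\, p(v | x_1, x_2)\, dx_2$$

$$= \frac{1}{p(v | x_1)}\, \frac{\partial}{\partial x_1} \int_{-\infty}^{\infty} p(x_2)\, k_1 \exp\left\{ -\tfrac{1}{2} \left( (v - s_1 - s_2),\ R_n^{-1}(v - s_1 - s_2) \right) \right\} dx_2 \qquad (7)$$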


where $k_1$ is a constant and (a, b) denotes the inner product of a and b given by

$$(a, b) = \int_0^T a(t)\, b(t)\, dt .$$

The operator $R_n^{-1}$ in Eq. (7) is defined by³

$$R_n^{-1}(v - s_1 - s_2) = \int_0^T R_n^{-1}(t, \mu) \left[ v(\mu) - s_1(\mu) - s_2(\mu) \right] d\mu \qquad (8)$$

or equivalently,

$$v - s_1 - s_2 = \int_0^T R_n(t, \mu) \left[ R_n^{-1}(v - s_1 - s_2) \right](\mu)\, d\mu .$$

For white additive noise, $R_n(t, \mu) = \delta(t - \mu)$ so that

$$R_n^{-1}(v - s_1 - s_2) = v - s_1 - s_2 \qquad (9)$$

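Substituting Eq. (9) into Eq. (7) and carrying out the indicated differentiation with
respect to $x_1$ produces

$$\frac{\partial}{\partial x_1} \log p(v | x_1) = \frac{1}{p(v | x_1)} \int_{-\infty}^{\infty} \int_0^T p(x_2)\, p(v | x_1, x_2)\, (v - s_1)\, \frac{\partial s_1}{\partial x_1}\, dt\, dx_2 - \frac{1}{p(v | x_1)} \int_{-\infty}^{\infty} \int_0^T p(x_2)\, p(v | x_1, x_2)\, s_2\, \frac{\partial s_1}{\partial x_1}\, dt\, dx_2 \qquad (10)$$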

The first expression in Eq. (10) is easily simplified by integrating with respect
to $x_2$. In the second expression we note that, if $x_1$ and $x_2$ are statistically independent,
we have

$$\frac{p(x_2)\, p(v | x_1, x_2)}{p(v | x_1)} = p(x_2 | v, x_1) \qquad (11)$$

With this, Eq. (10) is reduced to

$$\frac{\partial \log p(v | x_1)}{\partial x_1} = \int_0^T (v - s_1 - \bar{s}_2)\, \frac{\partial s_1}{\partial x_1}\, dt \qquad (12)$$

where

$$\bar{s}_2 = \int_{-\infty}^{\infty} p(x_2 | v, x_1)\, s_2\, dx_2 \qquad (13)$$

is recognized as the conditional mean and hence the minimum mean square estimate of
$s_2(t)$.

Now following the second term in Eq. (6),

$$\frac{\partial}{\partial x_1} \log p(x_1) = \frac{\partial}{\partial x_1} \left[ -\tfrac{1}{2} \left( x_1,\ R_{x_1}^{-1} x_1 \right) \right] = -R_{x_1}^{-1} x_1 \qquad (14)$$


Again making use of Eq. (8) to simplify Eq. (14) and adding the result to Eq. (12) as
per Eq. (6) we have, after an inverse transformation,

$$x_1(t) = k_1 \int_0^T R_{x_1}(t, u)\, \frac{\partial s_1}{\partial x_1}\, (v - s_1 - \bar{s}_2)\, du \qquad (15)$$

Solutions of Eq. (15) for $x_1(t)$ are the MAP estimates; we denote these by $\hat{x}_1(t)$.
Further, in Eq. (15) it is noted that the integral on the R.H.S. is a convolution integral in
which the filter $R_{x_1}(t, u)$ is low-pass, whereas the term $s_1\, \partial s_1 / \partial x_1$ involves double the
carrier frequency terms and hence would integrate to zero. Equation (15) therefore
simplifies to

$$\hat{x}_1(t) = k_1 \int_0^T R_{x_1}(t, u)\, \frac{\partial \hat{s}_1}{\partial \hat{x}_1}\, (v - \bar{s}_2)\, du \qquad (16)$$

Equation (16) clearly shows a phase-locked loop structure with $\bar{s}_2(t)$ as an external
input, as shown in Figure 1.
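The step from Eq. (15) to Eq. (16) rests on the product of the carrier and its phase derivative being a pure double-frequency term. The sketch below illustrates this numerically under simplifying assumptions that are not in the report: the phase is held constant over a short window, the low-pass filter is replaced by a plain time average, and all names and values are illustrative.

```python
import math


def lowpass_average(f, t_end, n=20000):
    """Midpoint-rule time average of f over [0, t_end], a crude stand-in for a low-pass filter."""
    dt = t_end / n
    return sum(f((i + 0.5) * dt) for i in range(n)) * dt / t_end


A1 = 1.0                    # carrier amplitude (illustrative)
W1 = 2.0 * math.pi * 100.0  # carrier frequency in rad/s (illustrative)
X1 = 0.3                    # phase estimate, held constant over the averaging window


def s1(t):
    return A1 * math.cos(W1 * t + X1)


def ds1_dx1(t):
    # Derivative of s1 with respect to its phase.
    return -A1 * math.sin(W1 * t + X1)


T_AVG = 0.1  # exactly ten carrier periods
# The self term s1 * ds1/dx1 contains only the double-frequency component.
self_term = lowpass_average(lambda t: s1(t) * ds1_dx1(t), T_AVG)
# A received wave whose phase is 0.2 rad away from the estimate leaves a baseband term.
error_term = lowpass_average(lambda t: A1 * math.cos(W1 * t + X1 + 0.2) * ds1_dx1(t), T_AVG)
print(self_term, error_term)
```

The self term averages to zero, while the phase offset leaves a baseband value of (A1²/2) sin(0.2), which is the familiar phase-detector characteristic of a phase-locked loop.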


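[Block diagram: multiplier acting on $(v - \bar{s}_2)$, loop filter $R_{x_1}(t, u)$, and phase
controlled oscillators generating $\hat{s}_1(t) = A_1 \cos(\omega_1 t + \hat{x}_1(t))$ and
$\partial \hat{s}_1 / \partial \hat{x}_1 = -A_1 \sin(\omega_1 t + \hat{x}_1(t))$, with $\bar{s}_2(t)$ as external input.]

PCO: Phase Controlled Oscillator

Fig. 1. Optimum receiver for $x_1(t)$.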

Following all steps from Eq. (7) through Eq. (16) but estimating $x_2(t)$ instead of
$x_1(t)$ we obtain

$$\hat{x}_2(t) = k_2 \int_0^T R_{x_2}(t, u)\, \frac{\partial \hat{s}_2}{\partial \hat{x}_2}\, (v - \bar{s}_1)\, du \qquad (17)$$

This is a receiver identical to Fig. 1 except that $\bar{s}_2$ is replaced by $\bar{s}_1$ and $R_{x_1}(t, u)$ is
replaced by $R_{x_2}(t, u)$. If we assume $\hat{s}_1$ and $\hat{s}_2$ are suitable estimates of $\bar{s}_1$ and $\bar{s}_2$,
respectively, i.e., the m.m.s.e. estimates are approximately equal to the MAP estimates,
then the receiver of Fig. 1 and its equivalent for obtaining $\hat{x}_2(t)$ can be coupled to obtain
a comprehensive receiver with no unknown inputs. Recalling that in the case of FM,
the modulation functions are $m_1(t) = \dot{x}_1(t)$ and $m_2(t) = \dot{x}_2(t)$, and replacing the phase
controlled voltage controlled oscillator (VCO) of Fig. 1 by a frequency controlled VCO,
such a cross coupled FM receiver appears as in Figure 2. It must be emphasized that
only when $\bar{s}_1(t) = \hat{s}_1(t)$ and $\bar{s}_2(t) = \hat{s}_2(t)$ will this receiver be optimum.


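[Block diagram: two cross-coupled loops. Legible labels include
$\partial \hat{s}_1 / \partial \hat{x}_1 = -A_1 \sin(\omega_1 t + \int^t \hat{m}_1(u)\, du)$, the loop filter $h_1(t, t')$, and
$\partial \hat{s}_2 / \partial \hat{x}_2 = -A_2 \sin(\omega_2 t + \int^t \hat{m}_2(u)\, du)$.]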


Fig.  2.  Novel  FM  detector  for  suppression  of  interchannel  interference. 

Figure  2 describes  the  physical  receiver  structure  experimentally  evaluated 
during  the  previous  research  period  and  discussed  in  Reference  1. 
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A NEW  APPROACH  TO  THE  EDGEWORTH  SERIES  APPROXIMATION  WITH  APPLICA- 
TION TO  THE  BINOMIAL  DISTRIBUTION 

C.Y.  Chang  and  L.  Kurz 

In evaluating the performance of some communication, radar and sonar systems,
as well as in reliability and quality control work, it is frequently necessary to find
accurate expressions for the binomial distribution. In the past, Edgeworth series
approximations have been used with success, but this approach fails when the probability
of success p ≤ 0.2 for the four-term series calculated so far. In this report, firstly,
the Edgeworth series is extended to include nine terms, which yields excellent accuracy
for p ≥ 0.1 and m ≥ 36; secondly, a technique of shift-accuracy-interval (TSAI) is
introduced which guarantees high accuracy even for a four-term Edgeworth series (10
percent error is easily achievable). The technique used to generate the $P_{m,n}$ polynomials
for the binomial distribution is sufficiently general to indicate applicability to other
distributions.

A.  The  Edgeworth  Series 

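Consider the sum of n i.i.d. r.v.

$$\xi = \xi_1 + \xi_2 + \cdots + \xi_n$$

with means $m_1$ and variance $\sigma_1^2$. Introducing the standardized r.v. $x = (\xi - m)/\sigma$, where
$m = n m_1$ and $\sigma = \sigma_1 \sqrt{n}$, we have an approximation to the p.d.f. of x

$$f(x) = c_0 \phi(x) + \frac{c_1}{1!}\, \phi'(x) + \frac{c_2}{2!}\, \phi''(x) + \cdots \qquad (1)$$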


where 


<t)^x)  '^^M.(x)  , i = 0,  1,2,  •••,<))°(x)  = <t.(x) 

VTir  " 

and  Mj^(x)  are  the  Hermite  polynomials 

Hi(x)  = J(-  j:(i-’2j)l  ^ x +h.  (i.2)X  ^ 


(2) 


(3) 


where the sum is taken from j = 0 to the largest value for which i - 2j ≥ 0. Equation (1)
is then


c c c 

f(x)  = [c^  + yf  Mj(x)  +-^  M2(x)  +-^  M3(x)  + • • •]  <j,(x) 


(4) 
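The polynomial coefficients entering Eq. (4) can be tabulated directly from the finite sum. The sketch below assumes the probabilists' Hermite polynomials with unit leading coefficient; the sign convention used in the report for the derivatives of φ(x) may differ, and the function names are hypothetical.

```python
from math import factorial


def hermite_coeffs(i):
    """Coefficients {power: value} of the probabilists' Hermite polynomial of order i."""
    coeffs = {}
    for j in range(i // 2 + 1):  # j runs while i - 2j >= 0
        magnitude = factorial(i) // (factorial(j) * factorial(i - 2 * j) * 2 ** j)
        coeffs[i - 2 * j] = (-1) ** j * magnitude
    return coeffs


def hermite_eval(i, x):
    """Evaluate the order-i polynomial at x from its coefficient table."""
    return sum(c * x ** k for k, c in hermite_coeffs(i).items())


for order in range(5):
    print(order, hermite_coeffs(order))
```

For order 4 the table gives x⁴ - 6x² + 3, so the coefficient of x^(i-2) in this convention is -i(i-1)/2.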


with 




COMMUNICATIONS 



c.  ^ / M.(x)f(x)dx 


(5) 


Because f(x) is of zero mean and unit variance, while its higher order moments are
$\mu_i$, the $c_i$ may be expressed by


^i  ^^i-2  ^^i-4 

■^^i,(i-2)  i-2  ■'■^i.(i-4)  i-4 


+ • 


V =1,2, 


(6) 


Denoting  by  r|^  the  semi- invariants  of  (|  - m)  and  by  r)'^  the  semi-invariants  of  m^ 
and  introducing 


’1  ’I  ' 

\ =—  and  V = — - 

V V V V 

(T  ff  j 


we  obtain 


f(x)  = <t.(x)  + A (-1) 

V=1 


(7) 


where  b ,,  is  a polynomial  in  , ,,  independent  of  n.  This  is  the  Edge- 

V,  v+2h  J v-n+J 

worth  series. 

Applying the Edgeworth series to the binomial distribution, one should keep in
mind that it is a discrete distribution while f(x) represents a p.d.f., a continuous
distribution. The approximation is meaningful in terms of the c.d.f., or the integral
replaces the sum as an approximant. Consider the binomial distribution with the f.f.

$$B_p(k) = \binom{m}{k}\, p^k q^{m-k} \qquad (8)$$


The  first  nine  terms  of  the  Edgeworth  series  in  terms  of  powers  of  1 /cr  may  be  written 
as 

f(x)  = <t)(x) 


. Jjl^  4,l’>(x)l 
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u 

- ♦<’!«  ♦"'(X)  ♦'“>W  ♦l‘^>(x)  t 

O' 

*(*)„  , 

O' 


. -L[^+I’>,X,  * 

a 

^♦"”w 

, ,<‘°>„  4%^  ♦I‘«(X>  4 

O' 

<i,<^®>(x)  +-%f^  +%p  +^itr 


(9) 


The coefficients $P_{m,n}$ are independent of n and are polynomials in p. The moments
may be evaluated using the moment-generating function

$$m_i = G_x^{(i)}(s) \big|_{s=0} , \quad \text{where } G_x(s) = E[e^{sx}]$$

It  is  convenient  to  write  the  moments  as 

m.  •A.p‘+B.(.)|^,(,.A.p‘  <■‘^’0,., 

where  A^(n)  = n(n-  1) [n-  (i-  1)]  . Finally,  in  a recursive  form 

"i«  +^■"1-  ‘®l"'|s=0  +S<»1U=0 

The central moments are then


sk-i 


Hk=  ^ 'r  ”^5  "\-r 

r=o 


(10) 


(11) 


(12) 


*»»  H 
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which  permits  us  to  find  ^ from  Equation  (6).  All  from  Pj^  ^ through  Pg^  24 

were  calculated.  The  tabulation  is  not  included  in  this  report  because  of  its  length; 
instead,  the  procedure  will  be  demonstrated  by  calculating  P2  From  Eqs.  (4),  (6) 
and  (9),  P2  ^ is 


where  [•]  that  part  of  the  bracketed  term  which  is  independent  of  n.  From  Eq.  (12)  and 
2 

the  fact  that  u = npq,  we  have 

^(mg-bmimg  +16m2m4)  . 
ff  npq 

The  m may  be  calculated  from  equation  (11).  They  are  of  the  form  m = n[**  •],  or 
r ^ 

f'-ll  ” 2 V 2 [*”6^2  ■ ^ 

(T  npq  npq 

where 


[m,]-  denotes  the  contribution  of  polynomial  in  p from  m^  of  the  form 


‘6^2 

^6^2 


f*^6^2  ” (polynomial  in  p independent  of  n).  Thus,  we  have 
[m,],  = n^  (274p‘*  - 750  p^  + 715p^  - 270  p + 3l] 


‘6^2 
Similarly 

[m^]^  = n(24p^  - 60  p^  + 50  p^  - 15  p +1),  or 
[^]  = 5(26 p^  - 26p  + 5) 

(T 

2 2 

By  the  same  procedure  ] =6p  - 6p+l,  obtaining 


'2,6  ■ ■ 6,4^^2J  ■ 1^4 


P,  ‘5(^1  =10(4pV-<ptll 


B.  Testing  the  Accuracy  of  the  Edgeworth  Series 

To use the Edgeworth series for the binomial distribution, replace $\phi(x)$ by

$$\phi(k) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left[ -\frac{(k - \bar{m})^2}{2 \sigma^2} \right] , \quad \text{where } \bar{m} = mp \text{ and } \sigma^2 = mp(1 - p) .$$

Let $B_a(k)$ be the series approximation of the actual $B_p(k)$. As a measure of
approximation we introduce

$$|D_p(k)| = |B_a(k)/B_p(k) - 1| .$$

Figure 1 shows the error for p = 0.1 and m = 36, with ℓ in (ℓ) indicating the number of
terms in the Edgeworth series. The figure demonstrates that the higher order Edgeworth
series improves the approximation in two ways: improved accuracy and a larger
trustable range of k. This method breaks down if σ < 1, as demonstrated in Fig. 2 for
p = 0.01 and m = 36. In the latter case the series diverges. To remove this difficulty,
a new technique is presented in the next section.

Fig. 1.
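The error measure can be tried out on the leading term alone. The sketch below is not the nine-term series of the report: it takes the approximation to be only the Gaussian leading term, which is enough to show how σ separates the two cases quoted in the text. Function names are hypothetical.

```python
from math import comb, exp, pi, sqrt


def binomial_ff(k, m, p):
    """Binomial frequency function C(m, k) p^k q^(m-k)."""
    return comb(m, k) * p ** k * (1.0 - p) ** (m - k)


def gaussian_term(k, m, p):
    """Leading term only: normal density with the binomial mean and variance."""
    mean, var = m * p, m * p * (1.0 - p)
    return exp(-((k - mean) ** 2) / (2.0 * var)) / sqrt(2.0 * pi * var)


def relative_error(k, m, p):
    """|D_p(k)| with the approximation taken as the leading term only."""
    return abs(gaussian_term(k, m, p) / binomial_ff(k, m, p) - 1.0)


for p in (0.1, 0.01):
    sigma = sqrt(36 * p * (1.0 - p))
    errors = [relative_error(k, 36, p) for k in range(0, 7)]
    print(f"p={p}: sigma={sigma:.3f}, |D| for k=0..6:", [f"{e:.2g}" for e in errors])
```

For m = 36 the standard deviation is 1.8 at p = 0.1 but only about 0.6 at p = 0.01, so the second case falls in the region σ < 1 where the text reports breakdown.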


Even if one uses nine terms of the Edgeworth series, the approximation breaks
down for σ ≤ 1. A powerful, yet simple, technique will now be introduced to ensure high
accuracy of approximation. Let $p_1 = ap$ and $q_1 = q/b$; then

$$B_p(k) = \binom{m}{k} p^k q^{m-k} = \frac{b^{m-k}}{a^k} \binom{m}{k} p_1^k q_1^{m-k} = C\, B_{p_1}(k) \qquad (13)$$

where

$$C = \frac{b^{m-k}}{a^k} = \exp[(m - k) \ln b - k \ln a] .$$

By a suitable choice of a and b, i.e., $\sigma_1 = (m p_1 q_1)^{1/2} > 1$, we reduce the problem to
finding an accurate expression for $B_{p_1}(k)$ using the Edgeworth series, while the constant,
C, may be calculated with a high degree of accuracy. Thus, by an appropriate change
in $p_1$ and $q_1$, we obtain an accurate expression for $B_p(k)$ for every k. From Fig. 1 it
is obvious that it suffices to use a four- or eight-term Edgeworth series for high accuracy
representation in the neighborhood of the mean. In general, a good rule is to
use $p_1$ so that $\sigma_1 > 1$; then TSAI will yield highly accurate results. The results are
shown in Fig. 3, where TSAI was applied to p = 0.1 and m = 36. In the figure, $|D_p(k)|$
was plotted for different choices of $p_1$. Similar results were obtained for p = 0.01 and
m = 36; therefore the graphs are not repeated.

[Figure 3: curves labeled $p_1$ = 0.4, 0.5 and 0.6 for the (4)- and (8)-term series;
horizontal axis k from 4 to 34.]

Fig. 3.
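The shift identity behind TSAI is exact and can be verified directly. In the sketch below the shifted distribution is evaluated exactly rather than by an Edgeworth series, so it checks only the factor C; the relations a = p_1/p and b = q/q_1 follow from the definitions of p_1 and q_1, and the function names are hypothetical.

```python
from math import comb, exp, log


def binomial_ff(k, m, p):
    """Binomial frequency function C(m, k) p^k q^(m-k)."""
    return comb(m, k) * p ** k * (1.0 - p) ** (m - k)


def tsai_factor(k, m, p, p1):
    """C = b^(m-k) / a^k with a = p1/p and b = q/q1, evaluated through logarithms."""
    a = p1 / p
    b = (1.0 - p) / (1.0 - p1)
    return exp((m - k) * log(b) - k * log(a))


m, p, p1 = 36, 0.01, 0.5
for k in (0, 1, 2, 5):
    shifted = tsai_factor(k, m, p, p1) * binomial_ff(k, m, p1)
    print(k, binomial_ff(k, m, p), shifted)
```

Here p_1 = 0.5 gives σ_1 = 3 for m = 36, well inside the range where the series is reported to be accurate, while the original p = 0.01 has σ of about 0.6.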

If one needs high accuracy expressions for the cumulative distribution function

$$\mathcal{B}_p(k') = \sum_{k=0}^{k'} B_p(k)$$

we select the values of $B_p(k)$ from the appropriate high accuracy intervals for $B_p(k)$ and
add them. The accuracy of this method was tested for p = 0.1 and 0.01, and m = 36, by
selecting five values from the first shift interval and four from other intervals. The
results are shown in Figures 4 and 5.
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AUTOCORRELATION  FUNCTION  AT  THE  OUTPUT  OF  MEMORYLESS  NONLINEARITIES 
USING  A DIFFERENTIAL  EQUATION  APPROACH 

R.  Chassaing  and  L.  Kurz 

In this report, a new approach is presented for finding near-optimum detectors
for noise distributions represented by a mixture of Gaussian and impulsive noise and
assuming the noise processes are Markovian. The approach concentrates on minimizing
the output noise power by solving the appropriate Fokker-Planck equation in conjunction
with the Fletcher-Powell search procedure.

The autocorrelation function obtained using the differential equation approach is
simpler to evaluate than using the standard transform method. Furthermore, the
approach of this report can be extended naturally to obtain expressions for multi-dimensional
p.d.f.'s, spectral density functions and signal-to-noise ratio.

A.  The  Fokker- Planck  Equation  and  Its  Solution 

Using the notation suggested in Reference 1, in finding the p.d.f.'s associated
with a Markov process, one may use the solution of the Fokker-Planck equation as an
expression of the desired p.d.f. The Fokker-Planck equation is a second order
differential equation

$$\frac{\partial W(x,t)}{\partial t} + \frac{\partial G(x,t)}{\partial x} = 0 \qquad (1)$$

where W(x,t) is a p.d.f. and

$$G(x,t) = K_1(x,t)\, W(x,t) - \frac{1}{2} \frac{\partial}{\partial x} \left[ K_2(x,t)\, W(x,t) \right] .$$

The coefficients $K_s(x,t)$, s = 1, 2, are the intensity coefficients defined in Ref. 1, and
depending on their choices various Markov noise processes may be described. If the
intensity coefficients do not depend on time, then the p.d.f., W(x,t), approaches its
stationary value, $W_{st}(x)$. It is easy to show that

$$W_{st}(x) = \frac{A}{K_2(x)} \exp\left\{ 2 \int_{x_1}^{x} \frac{K_1(y)}{K_2(y)}\, dy \right\} \qquad (2)$$

where the choice of A and $x_1$ must satisfy

$$\int_{-\infty}^{\infty} W_{st}(x)\, dx = 1 .$$

^ St 
- oo 

For $W_{st}(x)$ Gaussian, $K_1(x) = ax + b$, $K_2(x) = d$ and the resulting p.d.f. is

$$W_{st}(x) = K \exp\left[ \frac{a}{d} \left( x + \frac{b}{a} \right)^2 \right] \qquad (3)$$

where

$$K = 1 / \sqrt{-\pi d / a} .$$


To describe the amplitude distribution of the impulsive noise component, the p.d.f.
corresponding to the generalized Cauchy noise is introduced:

$$W_{st}(x) = \frac{B}{(x^2 + h)^m} \qquad (4)$$

For m = 1 we have the Cauchy distribution and for m → ∞ the Gaussian distribution. The
choice of m between these two limits represents a measure of the impulsiveness.

Solving Eqs. (2) and (4) yields

$$K_1(x) = ax , \quad K_2(x) = c(x^2 + h) \quad \text{and} \quad c = a/(1 - m) .$$
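The relation between the intensity coefficients and the generalized Cauchy shape can be checked by integrating Eq. (2) numerically. The sketch below uses a trapezoidal rule and illustrative parameter values that are assumptions, not values from the report; the function and variable names are hypothetical.

```python
import math


def stationary_unnormalized(x, k1, k2, x_ref=0.0, steps=2000):
    """Eq. (2) without the constant A: exp(2 * integral of K1/K2 from x_ref to x) / K2(x)."""
    step = (x - x_ref) / steps
    # Composite trapezoidal rule for the integral of K1(y)/K2(y).
    total = 0.5 * (k1(x_ref) / k2(x_ref) + k1(x) / k2(x))
    total += sum(k1(x_ref + i * step) / k2(x_ref + i * step) for i in range(1, steps))
    return math.exp(2.0 * step * total) / k2(x)


a, m_exp, h_par = -1.0, 2.0, 1.5   # illustrative values, with a < 0 and m > 1
c = a / (1.0 - m_exp)              # relation quoted in the text


def k1(x):
    return a * x


def k2(x):
    return c * (x * x + h_par)


def cauchy_shape(x):
    """Generalized Cauchy shape of Eq. (4) without its normalizing constant."""
    return (x * x + h_par) ** (-m_exp)


# Ratios to the value at x = 0 remove the normalizing constants on both sides.
for x in (0.5, 1.0, 3.0):
    numeric = stationary_unnormalized(x, k1, k2) / stationary_unnormalized(0.0, k1, k2)
    closed = cauchy_shape(x) / cauchy_shape(0.0)
    print(x, numeric, closed)
```

The two columns agree to within the quadrature error, which is consistent with c = a/(1 - m).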

A more general form of Eq. (4) may be obtained by the choice of $K_1(x) = a x^{n-1}$
and $K_2(x) = c(x^n + h)$. For n even, we obtain

$$W_{st}(x) = \frac{K}{c(x^n + h)} \exp\left\{ 2 \int_{h}^{x^n + h} \frac{a}{cn}\, \frac{du}{u} \right\} , \quad u = x^n + h \qquad (5)$$

$$= \frac{K}{c(x^n + h)} \exp\left\{ \ln (x^n + h)^{2a/cn} - \ln (h)^{2a/cn} \right\} \qquad (6)$$

which reduces to

$$W_{st}(x) = \frac{B}{(x^n + h)^{1 - 2a/cn}} \qquad (7)$$

where

$$B = \frac{K h^{-2a/cn}}{c} .$$

Note that Eq. (7) reduces to Eq. (4) for n = 2.

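Letting W(x,t) = X(x) T(t) in Eq. (1), dividing by W(x,t) and introducing a constant
$\lambda = -\dot{T}(t)/T(t)$,

$$\frac{1}{2} \frac{d^2}{dx^2} \left[ K_2(x) X \right] - \frac{d}{dx} \left[ K_1(x) X \right] + \lambda X = 0 \qquad (8)$$

The solution of Eq. (8) consists of a sequence of eigenfunctions $X_0, X_1, X_2, \cdots$, with
corresponding eigenvalues $\lambda_0, \lambda_1, \lambda_2, \cdots$. It can be shown that

$$(\lambda_m - \lambda_n) \int X_m X_n\, \frac{dx}{W_{st}(x)} = 0 \qquad (9)$$

if the boundary conditions

$$G[X] = G[W_{st}] = 0 \qquad (10)$$

are satisfied, where

$$G[X] = K_1(x) X - \frac{1}{2} \frac{d}{dx} \left[ K_2(x) X \right] . \qquad (11)$$

The orthogonality of the eigenfunctions with respect to the weight function $1/W_{st}(x)$
follows from Eq. (9) for $\lambda_m \ne \lambda_n$ (m ≠ n). Furthermore, each eigenfunction is
normalized such that

$$\int X_n^2\, \frac{dx}{W_{st}(x)} = 1 , \quad n = 0, 1, 2, \cdots \qquad (12)$$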

B. Solution of the Fokker-Planck Equation for the Gaussian Case

Letting $K_1(x) = ax$, $K_2(x) = -2a\sigma^2 = d$ in Eq. (8),

$$X''(x) - \frac{2a}{d}\, x X'(x) + \frac{2}{d} (\lambda - a) X(x) = 0 \qquad (13)$$

The solution of Eq. (13) about the ordinary point x = 0 is of the form

$$X(x) = \sum_{n=0}^{\infty} A_n x^n \qquad (14)$$

with two arbitrary constants $A_0$ and $A_1$. The $A_n$ can be found by substituting Eq. (14) into
Equation (13). A solution of the form

$$X(x) = W_{st}(x) \sum_{n=0}^{\infty} B_n x^n \qquad (15)$$

can be assumed, or Eq. (14) can be used to generate Equation (15). Expanding Eqs.
(14) and (15), equating like coefficients, the $B_n$ coefficients can be found in terms of the
$A_n$, where

$$A_n = -\frac{2}{d}\, \frac{\lambda - (n - 1) a}{n(n - 1)}\, A_{n-2} , \quad n \ge 2 \qquad (16)$$
2 

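The $B_n$ can be written with $d = -2a\sigma^2$:

$$B_0 = A_0 , \quad B_2 = \frac{1}{2!} \left( -\frac{2}{d} \right) \{ \lambda \} A_0 , \quad B_4 = \frac{1}{4!} \left( -\frac{2}{d} \right)^2 \{ \lambda (\lambda + 2a) \} A_0 , \quad B_6 = \frac{1}{6!} \left( -\frac{2}{d} \right)^3 \{ \lambda (\lambda + 2a)(\lambda + 4a) \} A_0$$

$$B_1 = A_1 , \quad B_3 = \frac{1}{3!} \left( -\frac{2}{d} \right) \{ \lambda + a \} A_1 , \quad B_5 = \frac{1}{5!} \left( -\frac{2}{d} \right)^2 \{ (\lambda + a)(\lambda + 3a) \} A_1 \qquad (17)$$

The eigenvalues can be found by inspection from the $B_n$'s.

$$\lambda_0 = 0 , \ \lambda_2 = -2a , \ \lambda_4 = -4a , \ \cdots ; \qquad \lambda_1 = -a , \ \lambda_3 = -3a , \ \cdots ; \qquad a < 0 \qquad (18)$$

Using Eq. (17) in Eq. (15), the eigenfunctions corresponding to the eigenvalues of Eq.
(18) are

$$X_0(x) = A_0 \{ 1 \} W_{st}(x) , \quad X_2(x) = A_0 \left\{ 1 - \frac{x^2}{\sigma^2} \right\} W_{st}(x) , \quad X_4(x) = A_0 \left\{ 1 - \frac{2x^2}{\sigma^2} + \frac{x^4}{3\sigma^4} \right\} W_{st}(x) , \ \cdots$$

$$X_1(x) = A_1 \{ x \} W_{st}(x) , \quad X_3(x) = A_1 \left\{ x - \frac{x^3}{3\sigma^2} \right\} W_{st}(x) , \ \cdots \qquad (19)$$

where $W_{st}(x)$ is the Gaussian p.d.f. Rearranging Eq. (19) yields the eigenfunctions in
terms of Hermite polynomials, or

$$X_n(x) = C_n H_n(x/\sigma)\, W_{st}(x) \qquad (20)$$

where $C_n$ is such that

$$\int_{-\infty}^{\infty} X_n^2(x)\, \frac{dx}{W_{st}(x)} = 1$$

is satisfied. The boundary conditions, Eq. (10), were satisfied for x = ±∞. If the
boundary condition $G[X_n] = 0$, n odd, is applied for x > 0, the arbitrary constant $A_1$ is
zero, and the solution is expressible in terms of even polynomials only. The differential
Eq. (13) has no singularity in the finite plane, hence its solution is valid for all finite
values of x.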
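The polynomial factors of the Gaussian-case eigenfunctions can be generated from a two-term recurrence. The sketch below assumes the recurrence n(n-1) B_n = -(2/d)[λ + a(n-2)] B_{n-2}, obtained here by substituting a polynomial times the Gaussian p.d.f. into Eq. (13), together with the terminating eigenvalue λ = -(order) a; the function name and the parameter values are hypothetical.

```python
def eigen_polynomial(order, a, sigma2):
    """Polynomial factor of the eigenfunction of given order for K1 = a x, K2 = d = -2 a sigma^2."""
    d = -2.0 * a * sigma2
    lam = -order * a                      # eigenvalue that makes the series terminate
    coeffs = [0.0] * (order + 1)          # coeffs[n] multiplies x**n
    coeffs[order % 2] = 1.0               # start from B0 = 1 (even order) or B1 = 1 (odd order)
    for n in range(order % 2 + 2, order + 1, 2):
        coeffs[n] = -(2.0 / d) * (lam + a * (n - 2)) * coeffs[n - 2] / (n * (n - 1))
    return coeffs


print(eigen_polynomial(3, a=-1.0, sigma2=1.0))
print(eigen_polynomial(4, a=-1.0, sigma2=1.0))
```

For unit variance the order-4 result is 1 - 2x² + x⁴/3, which is the Hermite polynomial x⁴ - 6x² + 3 divided by 3, and the order-3 result is x - x³/3.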

C.  Solution  of  the  Fokker-Planck  Equation  for  the  Generalized  Cauchy  Density  Function 

Using a similar procedure as for the Gaussian case, substituting K_1(x) and K_2(x),
which satisfy Eq. (4), into Eq. (8), the Fokker-Planck equation for the generalized
Cauchy becomes

    X''(x) + p(x) X'(x) + q(x) X(x) = 0        (21)

where

    p(x) = 2x (2 - a/c) / (x^2 + h)
    q(x) = 2 (1 - (a - \lambda)/c) / (x^2 + h)
    d/c = h > 0        (22)
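This second order differential equation has two regular singular points at x = ±i√h
in the finite complex plane. The solution of Eq. (22) yields the following eigenfunctions
and eigenvalues.

The eigenfunctions are

    X_0(x) = A \{1\} W_C(x)
    X_2(x) = A \{1 - \frac{\lambda_2}{d} x^2\} W_C(x)
    X_4(x) = A \{1 - \frac{\lambda_4}{d} x^2 + \frac{\lambda_4(\lambda_4 - \lambda_2)}{6 d^2} x^4\} W_C(x)
    X_6(x) = A \{1 - \frac{\lambda_6}{d} x^2 + \frac{\lambda_6(\lambda_6 - \lambda_2)}{6 d^2} x^4
                 - \frac{\lambda_6(\lambda_6 - \lambda_2)(\lambda_6 - \lambda_4)}{6 \cdot 15\, d^3} x^6\} W_C(x)
    X_1(x) = B \{x\} W_C(x)
    X_3(x) = B \{x - \frac{\lambda_3 - \lambda_1}{3d} x^3\} W_C(x)
    X_5(x) = B \{x - \frac{\lambda_5 - \lambda_1}{3d} x^3
                 + \frac{(\lambda_5 - \lambda_1)(\lambda_5 - \lambda_3)}{3 \cdot 10\, d^2} x^5\} W_C(x)        (23)

with corresponding eigenvalues,

    \lambda_0 = 0
    \lambda_2 = -(c + 2a)
    \lambda_4 = -(6c + 4a)
    \lambda_6 = -(15c + 6a)
    \lambda_3 = -(3c + 3a)
    \lambda_5 = -(10c + 5a)
    \lambda_7 = -(21c + 7a)        (24)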


Since the differential equation for the Cauchy case has only complex singularities
in the finite plane, its solution is valid for all finite values of real x, or

    X_n(x) = \Phi_n(x) W_C(x)        (25)

where Φ_n(x) are the orthonormal polynomials of the generalized Cauchy p.d.f.

The boundary conditions, Eq. (10), imposed on each X_n will be satisfied depending
on the value of m. Specifically, G[X_n] = 0 for x = ±∞ with n such that the moments of
the generalized Cauchy p.d.f. are finite. Note that as c → 0 (h → ∞), the set of
eigenvalues, Eq. (24), reduces to the eigenvalues of the Gaussian process, and the
eigenfunctions are expressed in terms of the Hermite polynomials. Such results should be
expected since K_1(x) and K_2(x) would then become the intensity coefficients of the
Gaussian process.
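The eigenvalues listed in Eq. (24) all fit a single pattern, which makes the Gaussian limit easy to check numerically. The sketch below assumes the closed form λ_n = -(n(n-1)c/2 + na); this formula is inferred here from the tabulated values and is not stated in the report. The assignment of the three odd entries to n = 3, 5, 7 is likewise an assumption, and the function names and sample values of a and c are illustrative.

```python
def cauchy_eigenvalue(n, a, c):
    """Assumed closed form lambda_n = -(n(n-1)/2 * c + n * a), fitted to Eq. (24)."""
    return -(n * (n - 1) / 2.0 * c + n * a)


# (c-coefficient, a-coefficient) pairs as printed in Eq. (24); the keys are assumed indices.
LISTED = {0: (0, 0), 2: (1, 2), 4: (6, 4), 6: (15, 6), 3: (3, 3), 5: (10, 5), 7: (21, 7)}


def matches_listed(a=-1.0, c=0.2):
    """True if the closed form reproduces every listed eigenvalue for the given a and c."""
    return all(
        abs(cauchy_eigenvalue(n, a, c) + (kc * c + ka * a)) < 1e-12
        for n, (kc, ka) in LISTED.items()
    )


def gaussian_limit(n, a):
    """Eigenvalue in the limit c -> 0, which should equal the Gaussian value -n a."""
    return cauchy_eigenvalue(n, a, 0.0)


if __name__ == "__main__":
    print(matches_listed())
    print([gaussian_limit(n, -1.0) for n in range(5)])
```

Setting c = 0 removes the quadratic term in n, so the spectrum collapses to the equally spaced Gaussian eigenvalues, which is the behaviour described in the text.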

D.  Evaluating  the  Autocorrelation  Function  Using  the  Differential  Equation  Technique 


The development of the differential equation technique applied to a Markov process
may be used to find such important results in the analysis of communication systems as
the autocorrelation function at the output of a nonlinearity, the multidimensional p.d.f.,
and the spectral density. The two-dimensional p.d.f. can be expressed in terms of the
eigenfunctions found before,

    p(x, x_0; \tau) = \sum_{j=0}^{\infty} X_j(x) X_j(x_0) e^{-\lambda_j \tau}        (26)

from which the autocorrelation function can be obtained,

    R_y(\tau) = \sum_{j=1}^{\infty} h_j^2 e^{-\lambda_j \tau}        (27)


where the coefficients h_j depend on the nonlinearity used and the eigenfunctions
associated with the noise affecting the system,

    h_j = \int^{\infty} F(x) X_j(x)\, dx        (28)

for even polynomials.
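The eigenfunction series for the output autocorrelation can be illustrated for the simplest limiter. The sketch below treats a unit-variance Gaussian Markov input and a hard limiter F(x) = sgn(x), and compares the truncated series with the classical arcsine law (2/π) arcsin ρ, where ρ = exp(aτ). Several things here are assumptions and are not taken from the report: the coefficients are normalized as h_j = ∫ F(x) He_j(x) φ(x) dx / √(j!), the eigenvalues are λ_j = -ja with a < 0, and all function names are hypothetical.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e as He


def limiter_coefficients(F, jmax, half_width=10.0, num=200001):
    """h_j = integral of F(x) He_j(x) phi(x) dx / sqrt(j!), j = 0..jmax (unit variance)."""
    x = np.linspace(-half_width, half_width, num)
    dx = x[1] - x[0]
    phi = np.exp(-0.5 * x**2) / math.sqrt(2.0 * math.pi)
    h = np.zeros(jmax + 1)
    for j in range(jmax + 1):
        c = np.zeros(j + 1)
        c[j] = 1.0                                   # select He_j
        g = F(x) * He.hermeval(x, c) * phi
        integral = (g[1:] + g[:-1]).sum() * dx / 2.0  # trapezoidal rule
        h[j] = integral / math.sqrt(math.factorial(j))
    return h


def autocorrelation(h, a, tau):
    """Series of the form of Eq. (27) with assumed lambda_j = -j a (a < 0), terms j >= 1."""
    j = np.arange(1, len(h))
    return float(np.sum(h[1:] ** 2 * np.exp(j * a * tau)))


if __name__ == "__main__":
    h = limiter_coefficients(np.sign, 15)
    a, tau = -1.0, math.log(2.0)                     # rho = exp(a * tau) = 0.5
    print(autocorrelation(h, a, tau), 2.0 / math.pi * math.asin(0.5))
```

Only odd-order coefficients survive for an odd nonlinearity such as the hard limiter, so the even-order integrals come out at rounding-error level.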

The nonlinearity of Fig. 1 is of a general form. Depending on the values of a_1,
a_2 and a_3, such nonlinearity may yield several different limiters frequently used in the

analysis  of  communications  systems,  such  as,  the  "hard"  limiter  (a^,  —■  oo),  the 

window  limiter  (aj=0,  the  triangular  limiter  (a2=aj).  Furthermore,  such 

nonlinearities, while providing a good approximation to the optimum nonlinearity
discussed by Kurz (Ref. 2), may be more easily implemented.

Fig. 1. Nonlinearity used in suboptimum detectors.

Applying the differential equation techniques, a computer program was developed
to find a nonlinearity of the form of Fig. 1 which would minimize the power of the noise,
R_y(0), under both noise conditions, the Gaussian and the generalized Cauchy noise.
Simulations  conducted  under  different  noise  conditions  showed  that  the  output  power  of 
the  noise  increases  as  a^  and  a^  increase.  It  is  worth  noting  here  again  that  the  gener- 
alized Cauchy  p.d.f.  used,  with  m as  a parameter,  yields  the  Gaussian  p.d.f.  (m  — oo), 
the  Cauchy  p.d.f.  (m  = 1),  with  different  degrees  of  noise  impulsiveness  characterized 
by  different  values  of  m.  The  triangular  nonlinearity  yields  the  smallest  R^(0). 

For the linear case, the analysis becomes simple since the coefficients h_j, j > 1,
in Eq. (27) vanish, yielding for the Gaussian case

    R_y(\tau) = \sigma^2 e^{-\frac{d}{2\sigma^2} |\tau|},   d > 0        (29)

and for the generalized Cauchy distribution

    R_y(\tau) = \sigma^2 \frac{(m - 1/2)}{(m - 3/2)} e^{-c(m-1)|\tau|}        (30)

where the parameters m, c, d and σ^2 were described previously. Note that R_y(0) in Eq.
(30) decreases as the noise becomes less impulsive (larger m), and reduces to Eq. (29)
as m → ∞.


E. Optimizing the Nonlinearity of Fig. 1 Using the Fletcher-Powell Search Procedure

Determining the most efficient nonlinearity of the form of Fig. 1 by varying the
parameters a_1, a_2 and a_3 is a brute force method. Instead, a formal search procedure,
suitable for the optimization problem, is developed here. The proposed procedure is
a special application of the Fletcher and Powell method of References 3 and 4. It
performs the calculation of an unconstrained minimum of a function of several variables.
(The procedure of Ref. 3 can also be used to handle constraints on the system.)


The criterion of performance is the minimization of the noise-to-signal ratio,
N/S, at the output of the nonlinearity. The case of a constant signal A is considered,
with the noise present characterized by the generalized Cauchy p.d.f. The generalized
Cauchy p.d.f., Eq. (4), can be written in the case of a shift in the mean as

    p(x - A) = \frac{K}{((x - A)^2 + h)^m}        (31)


A program was used to carry out the optimization problem. Using Ref. 3, the
gradient functions (partial derivatives of N/S) with respect to a_1, a_2 and a_3 were
evaluated.

Basically, the Fletcher-Powell procedure is an iterative process. The direction
of the optimal solution is provided depending on the algebraic sign of the gradient
functions. The optimum solution is achieved (minimization of N/S) when the gradient
functions with respect to a_1, a_2 and a_3 reduce to zero. Under such conditions, the optimum
nonlinearity is obtained with the optimal values a_i, i = 1, 2, 3, found which minimize N/S.
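The iterative search described above can be sketched with the Davidon-Fletcher-Powell variable-metric update. This is a minimal illustration and not the program used in the report: the N/S objective and its gradients are not reproduced, so a convex quadratic in three parameters stands in for them, and the backtracking line search, the names dfp_minimize, objective and gradient, and the target values are all assumptions made for the example.

```python
import numpy as np


def dfp_minimize(f, grad, x0, max_iter=500, tol=1e-8):
    """Davidon-Fletcher-Powell search with a simple backtracking line search."""
    x = np.asarray(x0, dtype=float)
    H = np.eye(x.size)               # running estimate of the inverse Hessian
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:  # gradient has effectively reduced to zero
            break
        p = -H @ g                   # search direction
        fx, slope, t = f(x), g @ p, 1.0
        # Halve the step until the Armijo sufficient-decrease test passes.
        while f(x + t * p) > fx + 1e-4 * t * slope and t > 1e-12:
            t *= 0.5
        s = t * p
        g_new = grad(x + s)
        y = g_new - g
        ys = y @ s
        if ys > 0.0:                 # update only when H stays positive definite
            Hy = H @ y
            H = H + np.outer(s, s) / ys - np.outer(Hy, Hy) / (y @ Hy)
        x, g = x + s, g_new
    return x


# Stand-in objective (NOT the report's N/S ratio): a convex quadratic in three parameters.
TARGET = np.array([1.0, -0.5, 2.0])
SCALE = np.array([1.0, 2.0, 0.5])


def objective(a):
    return float(np.sum(SCALE * (np.asarray(a, dtype=float) - TARGET) ** 2))


def gradient(a):
    return 2.0 * SCALE * (np.asarray(a, dtype=float) - TARGET)


if __name__ == "__main__":
    print(dfp_minimize(objective, gradient, [0.0, 0.0, 0.0]))
```

The search stops when the gradient norm falls below a tolerance, which mirrors the stopping condition stated in the text; replacing the stand-in objective with an actual N/S evaluation would require the limiter model of Fig. 1.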

The results are plotted in Figures 2 through 7. Note again that the signal-to-noise
ratio, S/N, was jointly optimized with the nonlinearity, under each noise condition.

The simulation results show that a_1, a_2 and a_3 increase as the noise becomes less
impulsive, showing an increase in the linear region of the limiter, and providing at the
same time less clipping. As the noise becomes less impulsive (larger m or smaller
σ^2), the output S/N increases as the signal strength A increases. It leads one to
conclude that the nonlinearity performs efficiently under various input S/N ratios.
A UNIQUE FREQUENCY DIVISION MULTIPLEX COMMUNICATION SYSTEM*

L.  Kurz  and  J.  Reed 

In  this  report,  a brief  description  of  a unique  FDM  system  is  given.  Details  of  its 
performance  and  a comparison  study  with  an  existing  system  are  given  in  Reference  1. 

For the actual FDM system, basic parameters and architecture for a
telecommunications switching system using radio frequency carriers on a coaxial cable in a
ring main form are developed. The resulting distributed system achieves transmission
by means of single sideband modulation. It also achieves four-wire connectibility by
transmitting either the main or image signal generated in mixers, using the fact that
the information passband is limited and filtering may be applied.

A similar system has been proposed and built previously. In both that system and
the present one, dialing is accomplished by frequency setting using digitally controlled
synthesizers. The earlier system had a receiving system using direct (synchronous)
conversion to audio. This report proves that such a restriction is not necessary and that
the same or better results can be achieved by using a more general formulation involving
both transmitting and receiving intermediate frequencies different from zero.

A further result of removing the restriction on the receiving IF is that it becomes
readily possible to use independent sideband transmission. This provides signalling
channels for improving the system performance, its connectibility and flexibility.

This  report  compares  both  of  these  systems  by  developing  the  frequency  plan  for 
each  and  demonstrating  that  the  original  system  is  a special  case  of  the  one  developed 
herein.  The  system  architecture  is  demonstrated  in  the  form  of  schematics  and  block 
diagrams. 

An investigation is made of the resulting system to develop further the design
rules and to demonstrate the practicality of the proposed scheme. Sources of noise
are enumerated and examined, in some detail, in order to show the limits presently
achievable in signal-to-noise performance. Intermodulation products are enumerated
and evaluated by the use of a modified form of Bennett's method.

Formulas are derived for choosing frequencies based upon the number of
subscribers requiring service and the achievable transmission bandwidth. Examples of the
use of these formulas are given showing the limitations on intermodulation terms
generated in terms of cross-over products.

Sensitivity calculations are made to demonstrate the absence of need for
broadband amplifiers with their unavoidable nonlinearities, leading to poorer IM performance.

*Patents involving the system summarized here are pending.

Power outputs from transmitters are shown to be those normally used in FDM systems.
Sensitivity calculations account for cable losses, coupler losses, power loss due to
multiple subscribers and to cable terminations.

The special advantages of the more general system described are elucidated by an
examination of the signalling and supervisory problem which was not satisfactorily
solved in the previous system. The conferencing, override and pre-empt problems
posed by this type of switch are each examined and several alternate solutions to these
problems are proposed.

In each case the solution to the problems of signalling is clarified by proposing
algorithms for designing the logic system to be used in making the various forms of
connections. These algorithms are given in terms of single channel dialing sequences.
Discussions of problems which arise in these sequences are provided together with
proposed methods of overcoming each problem. These solutions are further clarified
by means of block diagrams demonstrating proposed subscriber sets.

Indications are given where further profitable investigations, both theoretical and
experimental, are needed.

The conclusion drawn is that the general system formulas lead to a superior design
solving several previously unsolved problems.

ITT -Military  Electronic  System  L.  Kurz  and  J.  Reed 
ON CHANNEL MONITORING TECHNIQUES IN THE PRESENCE OF IMPULSIVE
NOISE OR FADING EFFECTS

I. M. Habib and L. Kurz

The robust p.d.f. estimation procedures presented by the authors in the
previous report (Ref. 1) are used to obtain channel monitoring estimators which
identify whether the noise in the communication channel is Gaussian, a mixture of
Gaussian and impulsive noise, or whether the channel operates in a fading mode. The
estimators presented here are helpful in developing adaptive detectors in a non-Gaussian
noise environment.

A.  Some  Representations  of  Impulsive  Noise 

Impulsive noise is generally characterized by pulses of short duration and large
amplitude which corrupt, to a certain extent, almost every transmission path.
Therefore, it is important that detection procedures be optimized to detect a signal
corrupted by a mixture distribution. Such procedures were developed in the literature
based on various mathematical representations of the impulsive noise.

This section reviews the various impulsive noise representations used in the
literature.

An impulsive noise model that has been used successfully assumes that the
impulsive noise consists of very narrow pulses (approximating impulses) occurring
randomly, where the frequency of pulse occurrence is governed by a certain
distribution. This model was used in the early work of Kurz (Ref. 2) for detecting a signal in
additive Gaussian and impulsive noise, with the frequency of pulse occurrence assumed
to follow a Poisson distribution

    f(n, T) = \frac{(aT)^n}{n!} e^{-aT}        (1)

where n is the number of pulses in an interval T and a is the long term average of the
number of pulses occurring per unit time. In this model, the distribution of amplitudes
was assumed to be of large variance but otherwise unspecified. The Poisson
distribution of occurrence of noise impulses can be easily generalized by replacing Eq. (1)
with a type B Gram-Charlier expansion.

A different representation of the impulsive noise assumes that the noise process
is represented by an independent increments process, where the p.d.f. associated
with the sample amplitude of the process may be given by a Cauchy amplitude
distribution, whose p.d.f. is given by

    f(x) = \frac{1}{\pi} \frac{\lambda}{\lambda^2 + x^2}        (2)

where λ is a specifying parameter. This model was also used by Kurz and Kapp
(Refs. 3 and 4) for evaluating the performance of suboptimum detectors in Gaussian and
impulsive noise. In general, this model is used when the impulsive noise is severe.

Alternately, an exponential amplitude distribution, with p.d.f. given by

    f(x) = a e^{-b|x|^n},   -\infty < x < \infty        (3)

where a, b and n specify the impulsiveness of the noise, is a useful representation
for a large class of weak impulsive noise.

B.  Gaussian  and  Weak  Impulsive  Noise  Channel  Estimator 

The communication channel in this section is assumed to be corrupted by
Gaussian and weak impulsive noise. The Gaussian noise is given by

    f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x - \mu)^2 / (2\sigma^2)}        (4)

where μ and σ are the average and standard deviation of the distribution, respectively.

The weak impulsive noise belongs to the exponential class and is given by the
double-exponential distribution*

    f(x) = \frac{\alpha}{2} e^{-\alpha |x|},   -\infty < x < \infty        (5)

where α is a specifying parameter of the distribution. The distribution of the noise
in the channel is a mixture of the Gaussian and impulsive noise with the c.d.f. given by

    F(x) = (1 - \delta) F_1(x) + \delta F_2(x),   -\infty < x < \infty        (6)

where F(x) is the c.d.f. of the mixture distribution,
      F_1(x) is the c.d.f. of the Gaussian noise,
      F_2(x) is the c.d.f. of the impulsive noise,
      δ is the mixing parameter.
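A mixture c.d.f. of this form is straightforward to sample, which is useful when reproducing simulations of the kind reported below. The sketch assumes that each sample is drawn from the double-exponential component with probability δ and from the Gaussian component otherwise; the function name sample_mixture, its arguments and the default parameter values are hypothetical and not taken from the report.

```python
import numpy as np


def sample_mixture(n, delta, alpha=0.5, mu=0.0, sigma=1.0, rng=None):
    """Draw n samples from (1 - delta) * Gaussian(mu, sigma) + delta * double-exponential(alpha)."""
    rng = np.random.default_rng() if rng is None else rng
    impulsive = rng.random(n) < delta               # True selects the impulsive component
    gauss = rng.normal(mu, sigma, n)
    # NumPy's Laplace density is exp(-|x| / b) / (2 b); b = 1 / alpha gives (alpha / 2) exp(-alpha |x|).
    laplace = rng.laplace(0.0, 1.0 / alpha, n)
    return np.where(impulsive, laplace, gauss)


if __name__ == "__main__":
    x = sample_mixture(200_000, delta=0.1, rng=np.random.default_rng(0))
    # For mu = 0 the mixture variance is (1 - delta) * sigma^2 + delta * 2 / alpha^2.
    print(x.var(), 0.9 * 1.0 + 0.1 * 2.0 / 0.5**2)
```

With α = 0.5 the double-exponential component has variance 8, so even a 10 percent admixture nearly doubles the total noise power relative to the unit-variance Gaussian.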

For this class of communication channels, the noise p.d.f. will be represented by a
series expansion of the orthogonal Hermite functions, i.e.,

    \hat{f}(n(t_j)) = \sum_{i=1}^{r} a_i \phi_i(n(t_j))        (7)

where a_i (i = 1, ..., r) are the expansion coefficients,
      φ_i(u) (i = 1, ..., r) are the Hermite functions given in Reference 1,
      n(t_j) are the noise samples (the output of a sampler).

The expansion coefficients are given by

    a_i = \int_u \phi_i(u) f(u)\, du        (8)

*Similarly, one may proceed with the more general distribution with the amplitude
p.d.f. given by Equation (3).

A recursive estimation procedure for estimating the a_i's using the Robbins-Monro
procedure has been developed in Reference 1.

The resulting recursive equation for estimating a_i is given by

    a_i^{(k+1)} = a_i^{(k)} + \gamma_k \left[ \phi_i(n(t_k)) - a_i^{(k)} \right]        (9)

where γ_k is a gain coefficient of the form 1/k.

As the number of the processed noise samples, k, increases, a_i^{(k)} approaches
a_i, leading to a better estimator. This noise estimator is actually the same estimator
presented in Ref. 1 for the Gaussian class where the random variable is replaced
here by the random noise sample n(t_k).
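The recursive coefficient estimate can be sketched in a few lines. The example assumes that the Hermite functions are the orthonormal (physicists') Hermite functions, that the recursion has the stochastic-approximation form a ← a + (1/k)[φ(n_k) - a], and that indices start at zero, so a[0] plays the role of the first coefficient. None of the function names come from the report or from Reference 1.

```python
import math

import numpy as np


def hermite_functions(x, r):
    """Orthonormal Hermite functions psi_0 .. psi_{r-1} evaluated at x; shape (r, len(x))."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    psi = np.zeros((r, x.size))
    psi[0] = math.pi ** -0.25 * np.exp(-0.5 * x**2)
    if r > 1:
        psi[1] = math.sqrt(2.0) * x * psi[0]
    for n in range(1, r - 1):
        # Stable three-term recurrence for the normalized functions.
        psi[n + 1] = math.sqrt(2.0 / (n + 1)) * x * psi[n] - math.sqrt(n / (n + 1)) * psi[n - 1]
    return psi


def robbins_monro_coefficients(samples, r):
    """Apply a <- a + (1/k) * (phi(n_k) - a) for k = 1, 2, ..., one noise sample at a time."""
    a = np.zeros(r)
    phi = hermite_functions(samples, r)
    for k in range(1, phi.shape[1] + 1):
        a += (phi[:, k - 1] - a) / k
    return a


def pdf_estimate(a, x):
    """Evaluate the series estimate sum_i a_i phi_i(x)."""
    return a @ hermite_functions(x, len(a))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = robbins_monro_coefficients(rng.normal(size=20_000), 6)
    print(a[0], pdf_estimate(a, 0.0)[0])
```

With gain 1/k the recursion is simply a running average of φ_i over the samples. For unit-variance Gaussian noise the first coefficient should approach π^(-1/4)/√2, about 0.531, which agrees with the Gaussian entry listed for the Hermite set in Tables II and IV.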

A block diagram that is used to implement this estimator is shown in Figure 1.
The noise process in the channel is sampled every Δt seconds and the samples are
processed by the p.d.f. estimator.


Fig.  1.  Channel  noise  estimator  for  channels  with  Gaussian 
and/or  weak  impulsive  noise. 

The sampling time Δt is chosen based on the type of transmitted signals. For
example, if the signals transmitted consist of a sequence of pulses, each of duration
T, then the sampling may be made over a time period (10T) at intervals of (T/100) seconds.
To ensure that the samples used in the estimation procedure are independent, the
samples generated from the first time period are interleaved with those generated in
the second, third, ... and tenth time periods. This interleaving results in samples
that are i.i.d. and suitable for processing by the p.d.f. estimator.

Upon identifying the channel noise, the detection path decision device selects
the appropriate detection procedure. It is to be noted that this estimator is adaptive
to the type of noise in the channel without changing any parameters of the estimator,
a particularly useful characteristic in practical systems. The results of computer
simulation of this estimator for a mixture of (0, 1) Gaussian noise and a
double-exponential distribution (α = 0.5) are shown in Figs. 2, 3 and 4 for mixing parameter
values δ = 0, 0.1 and 1, respectively. The range of the mixing parameter values used
in the simulation allows the examination of the performance of the estimator under all
noise conditions assumed in the channel. The graphs in Figs. 2, 3 and 4 show the
closeness of the noise estimate to the exact distribution and therefore its practical
usefulness. However, for a purely impulsive noise (δ = 1), more samples are required
to estimate the cusp of the p.d.f.


Fig.  2.  Gaussian  and  weak  impulsive  noise  channel  estimator. 

This estimator, in other words, is a robust estimator for the class of
communication channels corrupted by a mixture of Gaussian and weak impulsive noise, where
robustness is defined in the sense of Reference 1.


Figs. 3 and 4. Gaussian and weak impulsive noise channel estimator.
C. Gaussian and Severe Impulsive Noise Channel Estimator

The communication channel considered in this section is assumed to be corrupted
by a mixture distribution with severe impulsive noise whose frequency of occurrence
in time is given by a Poisson distribution

    f(n, T) = \frac{(aT)^n}{n!} e^{-aT}        (10)

as one component and a sample from the Gaussian noise process as the other
component. For this class of communication channels, the model of the impulsive noise
does not assume any amplitude distribution. Therefore, the channel noise estimator
for this class will not estimate the p.d.f. of the impulsive noise directly, but will
rather estimate the p.d.f. of a specific nonlinear functional of the impulsive noise.
This is especially useful since, in general, the detection of signals in impulsive noise
requires nonlinear processing of the received signal due to the presence of the
impulsive noise.

A nonlinear device that will be used for processing the received noise is shown
in Figure 5. This device consists of a soft limiter followed by an integrator and a
sampler. The sampled noise is processed simultaneously by two p.d.f. estimators
followed by a threshold decision device. It is assumed that the received Gaussian
noise level will not exceed the level A. This level could be adjusted periodically
based on the estimation of the Gaussian noise variance, σ^2, in the channel and setting
A = 4σ. In addition, the level of the impulsive noise in the channel will always exceed
A. The nonlinear device will pass the Gaussian noise with minimal corruption, but
will transform the impulsive noise to a series of pulses, each of level A, whose
occurrence in time follows a Poisson distribution.


Fig.  5.  Channel  noise  estimator  for  channels  with  Gaussian  and/or 
severe  impulsive  noise. 
The output of the nonlinear device is integrated over a time period T which
could be selected according to the type of signals transmitted. For example, in
digital transmission, T could be chosen as the inverse of the pulse rate.

In summary, the operation of the soft-limiter-integrator is as follows:

An integrator for the Gaussian noise.

A hard-limiter followed by an integrator for the impulsive noise.

In impulsive noise, the p.d.f. of the detector integrator is of the same class as in
Reference 1. The p.d.f. of the amplitude of the clipped and integrated impulsive
noise is given by


-kT 


f(Y. 


2 

-k 


io(k 


- Y^/T)^ 


®(Y^-T)  + ®(Y^+T) 


= 0 


lY^I  < T 


|Yt,|  > T 


(11) 


where

I_0 and I_1 are the modified Bessel functions of the first kind, order zero and
one, respectively,

and

k = aT.

In Gaussian noise, the output of the integrator is a random variable whose
average and standard deviation are functions of the mean, the autocorrelation function
of the input Gaussian process and the integration period T.

On the assumption that the noise is additive, the output of the integrator is
given by

    n(t) = (1 - \delta) n_1(t) + \delta n_2(t)

where

n_1(t) is a (μ, σ^2) Gaussian noise,
n_2(t) is impulsive noise whose p.d.f. is given by Eq. (11),
δ is the mixing parameter.

The output of the integrator is sampled every Δt seconds where Δt and the sampling
period are chosen as in Section B to ensure independence between samples. The
resulting samples are then used to estimate the p.d.f. of the noise in the channel.

The noise samples are processed by two different p.d.f. estimators. The
first estimator is of the form

    \hat{f}(n(t_j)) = \sum_{i=1}^{r} a_i \phi_i(n(t_j))        (12)

where

a_i (i = 1, ..., r) are the expansion coefficients,
φ_i(u) (i = 1, ..., r) are the Hermite functions.

The coefficients a_i are given by

    a_i = \int_u \phi_i(u) f(u)\, du        (13)

and are estimated recursively according to the following equation

    a_i^{(k+1)} = a_i^{(k)} + \gamma_k \left[ \phi_i(n(t_k)) - a_i^{(k)} \right]        (14)

The second estimator is of the form

    \hat{f}(n(t_j)) = \sum_{i=1}^{l} b_i \psi_i(n(t_j), c)        (15)

where

b_i (i = 1, ..., l) are the expansion coefficients,
ψ_i(u, c) (i = 1, ..., l) are the prolate spheroidal wave functions.

The expansion coefficients b_i are given by

    b_i = \int f(u) \psi_i(u, c)\, du,   i = 1, ..., l        (16)

These coefficients are estimated recursively using the Robbins-Monro procedure.
The recursive equation is given by

    b_i^{(k+1)} = b_i^{(k)} + \gamma_k \left[ \psi_i(n(t_k), c) - b_i^{(k)} \right]        (17)

As the number of the processed noise samples, k, increases, the estimate b_i^{(k)}
approaches b_i and a better p.d.f. estimation is obtained. The choice of the proper
noise p.d.f. is based on calculating the statistic S given by

    S = \int \hat{f}(u)\, du        (18)

for each of the estimators and choosing the one which gives a larger S.

This procedure was simulated for a mixture of a severe impulsive and Gaussian
noise where the output of the integrator is given by Equation (11). The results of the
computer simulation are shown in Figs. 6, 7 and 8 for various mixing parameters
δ = 0, 0.1, 1. The values of the statistic S for the various δ's are shown in Table I.
These results indicate the good performance of the statistic S as a channel condition
estimator, especially in the limiting cases, and that this estimator performs well under
any mixture of Gaussian and severe impulsive noise. For a pure impulsive noise, the
estimator of the form Eq. (15) is the proper one. For a pure Gaussian noise, this
estimator has very poor performance; however, the proper p.d.f. is given by the
estimator of the form Equation (12). For mixing parameter values between 0 and 1,
the better estimator depends on the dominant noise. The statistic S will be the basis
of the selection of one estimator versus the other.


Fig.  6.  Gaussian  and  severe  impulsive  noise  channel  estimator. 
TABLE I. Values of the statistic S for various
mixtures of Gaussian and impulsive noise.

    δ        S (Hermite functions)    S (P.S.W.F.)
    0        0.999                    0.92
    0.1      0.968                    0.92
    1.0      0.655                    0.99

A different approach for selecting the proper orthogonal set is based on
estimating the first coefficient in both of the series expansions, Equations (12) and (15).
The resulting estimates, a_1 and b_1, are compared and the orthogonal set yielding a
higher value for the first term is selected for estimating the noise p.d.f. This
procedure is justified on the basis that if the proper orthogonal set for the underlying
distribution is selected, then the first term in the series expansion contributes most
to the estimated p.d.f. Higher order terms have less contribution and may be
considered to have only a smoothing effect. This procedure is attractive because it is
less expensive than the previous one and may be easily implemented in a practical
communication system.

Figure 9 shows a block diagram that implements this procedure for estimating
the channel noise condition. The results of estimating the first coefficient for a
Gaussian and impulsive noise using both the P.S.W.F. and the Hermite sets are shown
in Table II.


Fig.  9.  Channel  noise  estimator  for  channels  with  Gaussian 
and/or  severe  impulsive  noise 
TABLE II.

    Distribution       Using Hermite functions    Using P.S.W.F.
    Gaussian noise     0.531                      0.49
    Impulsive noise    0.14                       0.2447

D. Gaussian and Fading Channel Noise Estimator

The communication channel in this section is assumed to be in one of two states,
G or B. In the G state, it is assumed that the channel is corrupted by Gaussian
noise of the form Equation (4). In the B state, the channel is assumed to be in a
fading state with amplitude distribution of the one-sided class (e.g. lognormal,
Rayleigh or Beta). A block diagram of the estimator that will be used in identifying
the channel noise condition is shown in Figure 10.


Fig.  10.  Channel  noise  estimator  for  channels  with  Gaussian 
or  noise  due  to  fading. 

The samples of the channel noise are processed by two p.d.f. estimators. The
first estimator is of the form Eq. (7) where the samples are processed according to
the recursive Eq. (9) and the resulting p.d.f. estimate is denoted by \hat{f}_1(n).

The second p.d.f. estimator is of the form

    \hat{f}(n(t_j)) = \sum_{i=1}^{q} c_i L_i(n(t_j))        (19)

where

c_i (i = 1, ..., q) are the expansion coefficients,
L_i(u) (i = 1, ..., q) are the Laguerre polynomials defined in Reference 1,
n(t_j) are the noise samples (obtained at the output of the sampler).

The coefficients c_i are given by

    c_i = \int_u f(u) L_i(u)\, du        (20)

A recursive procedure has been developed for this class of estimators in Ref. 1 to
estimate the coefficients c_i. The resulting recursive equation is given by

    c_i^{(k+1)} = c_i^{(k)} + \gamma_k \left[ L_i(n(t_k)) - c_i^{(k)} \right]        (21)

where γ_k is a gain coefficient of the form 1/k. The second p.d.f. estimator processes
the samples according to Eq. (21) and the resulting p.d.f. estimate is denoted by
\hat{f}_2(n).

The statistic S defined by

    S_j = \int \hat{f}_j(u)\, du        (22)

where j = 1 or 2, is calculated for the first (j = 1) estimator and the second (j = 2)
estimator. The p.d.f. estimate leading to the larger statistic is chosen to be the
proper p.d.f. The results of simulating the above procedure on the computer for a
channel whose G state is represented by a (0, 1) Gaussian noise and the B state is
represented by a beta distribution given by

    f(x) = \frac{(\alpha + \beta + 1)!}{\alpha!\, \beta!} x^{\alpha} (1 - x)^{\beta},   0 < x < 1
         = 0,   elsewhere

where α and β are the specifying parameters of the distribution, are shown in Figures
11 and 12.

The  values  of  the  statistic  S for  both  states,  G and  B,  are  shown  in  Table  III. 

The  results  of  this  simulation  show  the  excellent  performance  of  the  channel 
noise  estimator  and  the  sensitivity  of  the  statistic  S for  the  proper  channel  noise 
e stimate. 

A different approach to estimation of the channel noise conditions is based on estimating the first term in the p.d.f. series expansion using the Hermite functions and the Laguerre polynomials. The estimated terms, $a_1$ and $c_1$, are compared and the set yielding the higher value for the first term is used to estimate the noise p.d.f.

Fig. 11.  Fig. 12.

TABLE III. Values of the statistic S for Gaussian noise or noise due to fading.

                                       Statistic S
Distribution                     Hermite       Laguerre
                                 functions     functions

Gaussian noise                   0.999         0.621
Noise due to fading (from the    0.533         0.932
class of one-sided dist.)

Figure 13 shows a block diagram that implements this procedure. The results of
estimating $a_1$ and $c_1$ for Gaussian noise and for noise in a fading channel are shown in
Table IV.


Fig. 13. Channel noise estimator for channels with Gaussian noise
or noise due to fading.

TABLE IV. The value of the first term in the expansion.

Distribution             Using Hermite     Using Laguerre
                         functions         polynomials

Gaussian noise           0.531             0.314
Noise due to fading      0.23              0.697
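The Hermite entry for Gaussian noise in Table IV can be checked with a short numerical sketch. It assumes (this is not stated in the text above) that the first term uses the zeroth-order orthonormal Hermite function $\pi^{-1/4} e^{-x^2/2}$, and that the recursive estimate with gain $1/k$ is the running average of that function over the noise samples. The names `h0` and `recursive_mean` are illustrative only.

```python
import math
import numpy as np

def h0(x):
    """Zeroth orthonormal Hermite function, pi**(-1/4) * exp(-x**2 / 2) (assumed basis)."""
    return math.pi ** -0.25 * np.exp(-0.5 * np.asarray(x, dtype=float) ** 2)

def recursive_mean(values):
    """Recursive estimate c_k = c_{k-1} + (1/k) * (v_k - c_{k-1}), i.e. gain 1/k."""
    c = 0.0
    for k, v in enumerate(values, start=1):
        c += (v - c) / k
    return c

rng = np.random.default_rng(0)            # fixed seed, for illustration only
noise = rng.standard_normal(20000)        # (0, 1) Gaussian noise samples
values = h0(noise)
c_hat = recursive_mean(values)            # recursive estimate of the first term
c_exact = math.pi ** -0.25 / math.sqrt(2.0)   # E[h0(X)] for X ~ N(0, 1)
print(round(c_hat, 3), round(c_exact, 3))
```

Under these assumptions the closed-form value $\pi^{-1/4}/\sqrt{2}$ rounds to 0.531, the value listed for Gaussian noise with Hermite functions.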

The three channel noise condition estimators introduced above are intended to show the application of the p.d.f. estimator developed in Chapter 3 to the field of communication engineering. The specific problems considered are intended for use in the detection of signals corrupted by Gaussian and/or non-Gaussian noise (e.g., impulsive noise or amplitude fading). The results of these applications show the usefulness of these procedures for channel noise identification and the sensitivity of the statistic S to the proper noise model.
SUPPRESSION OF TIMING JITTER IN PARTIAL RESPONSE SYSTEMS
M.  Kavehrad  and  L.  Kurz 

The rise in demand for rapid data transmission over telephone lines has led to major research activity aimed at improving the equalization and general performance of PAM systems. In particular, numerous papers have been written about partial response systems. The effort here is concentrated on the study of the partial response systems suggested by Kretzmer when timing jitter immunity constraints are included. In particular, timing jitter suppression is achieved by imposing a zero-derivative constraint at all sampling points. It is shown that the minimum bandwidth of a signal satisfying the zero-intersymbol-interference and zero-derivative constraints is twice the Nyquist bandwidth. It is proven that the overall system response in the frequency domain is a triangular spectrum followed by a tapped delay line. In addition, properties of signals whose bandwidth exceeds twice the Nyquist bandwidth are explored. Modifications of Kretzmer's signals that include timing jitter suppression are given.

A.  Mathematical  Formulation  of  the  Problem 

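Following Gibby and Smith, the system response in time, $r(t)$, and in frequency, $R(\omega) = A(\omega)e^{j\alpha(\omega)}$, are related, if the Nyquist conditions are satisfied, by

$$ r_k = \frac{1}{2\pi} \sum_{n=-\infty}^{\infty} \int_{-\pi/T}^{\pi/T} R\!\left(u + \frac{2n\pi}{T}\right) e^{jukT}\, du \qquad (1) $$

where $\omega = u + 2n\pi/T$ and $r_k = r(kT)$, $k = 0, 1, 2, \dots$

Introducing the notation $R_n = R(u + 2n\pi/T)$, $A_n = A(u + 2n\pi/T)$ and $\alpha_n = \alpha(u + 2n\pi/T)$, the Nyquist problem reduces to

$$ \sum_{n=-\infty}^{\infty} A_n \cos\alpha_n = T \sum_k r_k \cos ukT \qquad (2) $$

$$ \sum_{n=-\infty}^{\infty} A_n \sin\alpha_n = -T \sum_k r_k \sin ukT \qquad (3) $$

Thus, signals satisfying Eqs. (2) and (3) eliminate intersymbol interference at sampling
points $k = 1, 2, 3, \dots$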

To suppress timing jitter, it is suggested that

$$ \left.\frac{dr(t)}{dt}\right|_{t=kT} = 0, \qquad k = 0, 1, 2, \dots $$


This condition assures flatness of the signal in the neighborhood of the sampling points, yielding immunity to timing jitter. Using an analysis similar to the one suggested by Gibby and Smith, suppression of timing jitter adds two design equations to Eqs. (2) and (3), namely,

$$ \sum_{n=-\infty}^{\infty} n A_n \cos\alpha_n = -\frac{T^2}{2\pi}\, u \sum_k r_k \cos(kTu) \qquad (4) $$

$$ \sum_{n=-\infty}^{\infty} n A_n \sin\alpha_n = \frac{T^2}{2\pi}\, u \sum_k r_k \sin(kTu) \qquad (5) $$

We seek a signal of minimum bandwidth which satisfies Eqs. (2) to (5). If one claims that there exists a signal of bandwidth $n\pi/T$ with $n < 2$ satisfying the four equations, one arrives at a contradiction. Referring to Eqs. (2) through (5), for $n < 2$ only the term corresponding to $n = 0$ contributes to the summations, and Eqs. (4) and (5) cannot be satisfied for $0 \le u \le (2 - n)\pi/T$. Consequently, a bandwidth of at least two Nyquist bandwidths is required ($n = 2$). For the two terms ($n = -1$ and 0) contributing to the sums in the interval $0 \le u \le \pi/T$, Eqs. (2) to (5) reduce to


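$$ A_0 \cos\alpha_0 + A_{-1}\cos\alpha_{-1} = T \sum_k r_k \cos kuT \qquad (6) $$

$$ A_0 \sin\alpha_0 + A_{-1}\sin\alpha_{-1} = -T \sum_k r_k \sin kuT \qquad (7) $$

$$ -A_{-1}\cos\alpha_{-1} = -\frac{T^2}{2\pi}\, u \sum_k r_k \cos kuT \qquad (8) $$

$$ -A_{-1}\sin\alpha_{-1} = \frac{T^2}{2\pi}\, u \sum_k r_k \sin kuT \qquad (9) $$

Solving the above system of equations, one obtains

$$ R(\omega) = T\left(1 - \frac{T}{2\pi}|\omega|\right) \sum_k r_k e^{-j\omega kT}, \qquad 0 \le |\omega| \le \frac{2\pi}{T} \qquad (10) $$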


which represents a triangular spectrum followed by a tapped delay line. For the standard PAM system, Eq. (10) reduces to the triangular spectrum. For the duobinary system suggested by Lender, Eq. (10) reduces to

$$ R(\omega) = T\left(1 - \frac{T}{2\pi}|\omega|\right)\cos\frac{\omega T}{2}, \qquad 0 \le |\omega| \le \frac{2\pi}{T} \qquad (11) $$
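The corresponding signal is then

$$ r(t) = \frac{1}{2}\left\{ \left[\frac{\sin\left[\frac{\pi}{T}\left(t+\frac{T}{2}\right)\right]}{\frac{\pi}{T}\left(t+\frac{T}{2}\right)}\right]^2 + \left[\frac{\sin\left[\frac{\pi}{T}\left(t-\frac{T}{2}\right)\right]}{\frac{\pi}{T}\left(t-\frac{T}{2}\right)}\right]^2 \right\} \qquad (12) $$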
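The flatness property can be checked numerically for the standard PAM case. The sketch below relies on the known transform pair between a triangular spectrum $T(1 - T|\omega|/2\pi)$ on $|\omega| \le 2\pi/T$ and the pulse $[\sin(\pi t/T)/(\pi t/T)]^2$. The value `T = 1.0`, the helper names and the finite-difference step are illustrative choices, not part of the report.

```python
import numpy as np

T = 1.0  # symbol interval (arbitrary units, illustrative)

def r_triangular(t):
    """Pulse whose spectrum is triangular, T(1 - T|w|/(2 pi)) for |w| <= 2 pi / T."""
    return np.sinc(t / T) ** 2          # np.sinc(x) = sin(pi x) / (pi x)

def r_nyquist(t):
    """Minimum-bandwidth Nyquist pulse (rectangular spectrum), for comparison."""
    return np.sinc(t / T)

def slope(pulse, t, h=1e-5):
    """Central-difference estimate of the derivative of pulse at t."""
    return (pulse(t + h) - pulse(t - h)) / (2.0 * h)

k = np.arange(-4, 5)
samples = r_triangular(k * T)             # zero intersymbol interference
slopes_tri = slope(r_triangular, k * T)   # zero derivative at every sampling point
slopes_nyq = slope(r_nyquist, k * T)      # nonzero for k != 0
print(np.round(samples, 6))
print(np.round(slopes_tri, 6))
print(np.round(slopes_nyq, 6))
```

The triangular-spectrum pulse has both zero samples and zero slope at every $t = kT$, $k \ne 0$, whereas the minimum-bandwidth pulse has zero samples but slopes of magnitude $1/(|k|T)$ there, which is what makes it sensitive to timing jitter.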


B.  Signals  of  Bandwidth  Greater  Than  Twice  the  Nyquist  Bandwidth 

Consider a system with signalling contained in the interval $(-n\pi/T, n\pi/T)$, where
$2 < n < 3$. Following the procedure outlined in Section A, Eq. (10) is replaced by


R(to)  = 


T ,,  T . , ljT 

(1  - — oj  ) cos(-^ 


(n  - 2)tt  . . , . , TT 

< w < (4-n)  — 


Ztt  2 ' T - ~ ’ V * ^ 

T , T > /wT>  (4-n)Tr  . . , 

2 (1  COS(~'  -S - 


(4-  n)Tr  . . , ,,  IT 

- -(n-2)Y 


undetermined 


otherwise 


If, in a data transmission system, we transmit simultaneously at a high-speed and a low-speed rate, the low-speed part of the data stream would be contained in $|\omega| < (n - 2)\pi/T$. The high-speed part of the data stream will utilize only twice the Nyquist bandwidth. For this type of system, Eq. (11) is replaced by


0 


0 < oj 


— T 


TT 


R(w)  = 


i(l-^M)cos(^) 

i(2-*M)cos(^) 


(n-  2)it/T  < I oj  I < (4-  n)iT/T 
(4-  n)Tr/T  < | oo  | < mr/T 


(13) 


Consider partial response signals with $2 < n < 3$. We seek the form of the partial response signals which will suppress timing jitter and utilize minimum energy outside the basic range of frequencies; that is, our goal is the solution to Eqs. (6) to (9) subject to energy constraints and $2 < n < 3$.

After some simple mathematical manipulations, the signal constraint may be
written as

$$ E = E_0 + \frac{1}{\pi} \int_0^{(n-2)\pi/T} \left[ A_{-1}^2(u) + A_0^2(u) + A_1^2(u) \right] du \qquad (14) $$

where $E_0$ represents the fixed energy in the interval $(n - 2)\pi/T \le |\omega| \le (4 - n)\pi/T$.

Using the standard methods of the calculus of variations, the energy-constrained
solution to Eqs. (6) to (9) is

$$ R(\omega) = \begin{cases} \dfrac{T}{6}\cos\left(\dfrac{\omega T}{2}\right), & 0 \le |\omega| \le (n-2)\pi/T \\ \dfrac{T}{2}\left(1 - \dfrac{T}{2\pi}|\omega|\right)\cos\left(\dfrac{\omega T}{2}\right), & (n-2)\pi/T \le |\omega| \le (4-n)\pi/T \\ \dfrac{T}{12}\left(5 - \dfrac{3T}{2\pi}|\omega|\right)\cos\left(\dfrac{\omega T}{2}\right), & (4-n)\pi/T \le |\omega| \le n\pi/T \end{cases} \qquad (15) $$
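The minimum-energy step can be illustrated with a small least-norm computation. This is a sketch under stated assumptions: for a normalized frequency $x = Tu/2\pi$ in the band where three folded terms $R_1$, $R_0$, $R_{-1}$ contribute, the zero-intersymbol-interference and zero-derivative conditions are taken as $R_1 + R_0 + R_{-1} = K$ and $(x+1)R_1 + xR_0 + (x-1)R_{-1} = 0$, with $K$ a common real factor, and the energy as the sum of squares. The function names are hypothetical.

```python
import numpy as np

def min_energy_terms(x, K=1.0):
    """Least-norm (R_1, R_0, R_-1) for normalized frequency x = T u / (2 pi)."""
    A = np.array([[1.0, 1.0, 1.0],           # folded zero-ISI condition
                  [x + 1.0, x, x - 1.0]])    # folded zero-derivative condition
    b = np.array([K, 0.0])
    # For an underdetermined full-rank system, lstsq returns the minimum-norm solution.
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    return sol

def closed_form(x, K=1.0):
    """Closed-form least-norm solution: R_n = K/3 - n * x * K / 2 for n = 1, 0, -1."""
    return np.array([K * (1 / 3 - x / 2), K / 3, K * (1 / 3 + x / 2)])

for x in (0.05, 0.2, 0.45):
    print(x, min_energy_terms(x), closed_form(x))
```

Under these assumptions the low-band term is $K/3$, independent of frequency, and both outer-band terms can be written as $(K/6)(5 - 3y)$ with $y = T|\omega|/2\pi$, which is the same frequency dependence as the constant and $(5 - 3T|\omega|/2\pi)$ factors appearing in the energy-constrained solution.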


C. Comparison of Signal-to-Noise Ratios for Partial Response Signals With and Without Timing Jitter Immunity

In this section, a comparative study of the Kretzmer signals with and without timing jitter immunity is presented. It is shown that the degradation in signal-to-noise ratio is the same in both cases; the additional bandwidth required for timing jitter suppression guarantees the same signal-to-noise ratio as for the signals which have intersymbol interference immunity only. The comparative data for Kretzmer's signals without and with timing jitter immunity are given in Tables I and II, respectively.

TABLE I. Channels without timing jitter immunity.

Signal   Partial Response     $R(\omega)$                                          Signal-to-Noise
Class    System                                                                    Degradation in dB

Ideal    1                    $T$                                                  0
I        $1 + D$              $2T\cos(\omega T/2)$                                 2.1
II       $1 + 2D + D^2$       $4T\cos^2(\omega T/2)$                               6.0
III      $2 + D - D^2$        $2T + T[\cos\omega T - \cos 2\omega T]$              1.2
                              $\; - jT[\sin\omega T - \sin 2\omega T]$
IV       $1 - D^2$            $2jT\sin\omega T$                                    2.1
V        $-1 + 2D^2 - D^4$    $4T\sin^2\omega T$                                   6.0

$D$ - delay operator
TABLE II. Channels with timing jitter immunity.

Signal   Partial Response     $R(\omega)$                                                                   Signal-to-Noise
Class    System                                                                                             Degradation in dB

Ideal    1                    $T(1 - \frac{T}{2\pi}|\omega|)$                                               0
I        $1 + D$              $2T(1 - \frac{T}{2\pi}|\omega|)\cos(\omega T/2)\,e^{-j\omega T/2}$            2.1
II       $1 + 2D + D^2$       $4T(1 - \frac{T}{2\pi}|\omega|)\cos^2(\omega T/2)\,e^{-j\omega T}$            6.0
III      $2 + D - D^2$        $T(1 - \frac{T}{2\pi}|\omega|)[(2 + \cos\omega T - \cos 2\omega T)$           1.2
                              $\; - j(\sin\omega T - \sin 2\omega T)]$
IV       $1 - D^2$            $2jT(1 - \frac{T}{2\pi}|\omega|)(\sin\omega T)\,e^{-j\omega T}$               2.1
V        $-1 + 2D^2 - D^4$    $4T(1 - \frac{T}{2\pi}|\omega|)\sin^2(\omega T)\,e^{-j2\omega T}$             6.0

$D$ - delay operator
ON ESTIMATION OF CHANNEL CONDITIONS IN M-ARY SYSTEMS WITH RANDOM
SIGNALS  AND  INTERSYMBOL  INTERFERENCE 

L.  Kurz 

This report represents our extension of the channel-condition estimation problem considered previously. An estimator for the probability of error of a channel disturbed by additive noise with random signaling, intersymbol interference and M-ary transmission is introduced and analyzed. The estimator is based on simple intuitive notions and is useful for channels where almost all of the intersymbol interference exists in the filters adjacent to the signal (active) filter (for instance, an FSK system). The statistical analysis for the M-ary system is more complicated than the analysis of the previous report. After a few simplifying assumptions, an expression for the mean of the estimator is obtained which is a function of the mean of the estimator of the dependence, $\rho$, and the noise and signal p.d.f.'s.

A.  System  Model 

The transmitter sends one of m signals ($S_1, S_2, \dots, S_m$) with known a priori probabilities $p_1, p_2, \dots, p_m$, respectively. The receiver is composed of m filters with each filter matched to one of the signals. At the end of a signaling interval, the signal classifier decides which signal had been transmitted on the basis of which filter had the largest sample output. In this model it is known that the outputs from the signal and idle filters are not independent. The exact model to be assumed for the nature of the dependence is not known. A typical modulation scheme which corresponds to this model is the M-ary FSK system. In this scheme each of the signals corresponds to a pulse which is centered at frequency $f_i$. Each of the frequencies, $f_i$, is different, but they are usually spaced closely together to confine the signal energy to as small a range of frequencies as possible. The filters adjacent to the signal filter contain a considerable amount of energy from the signal filter because the pulse may not be perfectly shaped, the channel may have some Doppler effect on the transmitted frequency, and the filters at the receiver are not perfectly orthogonal to the transmitted waveform. In the system model it is then assumed that the intersymbol interference decreases as the distance between the idle filter and the signal filter increases. The estimator of channel conditions for this model is complicated; it requires an estimate of the dependence between the active (signal) and idle filters, which depends on the distance between the active and idle filters. The estimator of the channel conditions would then remove the dependence (to as large a degree as possible) and calculate an estimate of the error probability as if the signal and idle filters were independent.
In the proposed model, the signal filter output is composed of two components: a signal which is a sample function of a stochastic process, plus additive noise that is independent from sample to sample. Each of the filters adjacent to the signal filter has an output composed of a function g(S) plus additive noise. The outputs of nonadjacent filters contain noise only. The p.d.f. of the noise is the same for all filters, the function g(S) is the same regardless of which signal is transmitted, and the p.d.f. of the output signal is also independent of which signal is transmitted.

If $S_i$ is transmitted, the output of the filter matched to $S_i$ will be denoted by $S_i + N_i$, the adjacent idle filter outputs by $g(S_i) + N_{i-1}$ and $g(S_i) + N_{i+1}$, respectively, and the outputs of the other idle filters by $N_j$, $j = 1, \dots, i-2, i+2, \dots, m$.

B.  The  True  Average  Probability  of  Error 

The expression for the probability of error must be broken up into two terms: one for the case where $S_1$ or $S_m$ is transmitted, and one for the case where $S_i$ is transmitted for $i = 2, \dots, m-1$. The probability of making an error given that $S_i$ is sent is equal to the probability that the signal filter output is equal to the value $x_2$ multiplied by the probability that the maximum of the idle filter outputs will be greater than $x_2$ given that the signal filter output is $x_2$, summed over all values of $x_2$. The p.d.f. of the maximum of the idle filters will vary depending on which signal is sent. In mathematical terms, we obtain


$$ Pe_m = \sum_{i=1}^{m} Pe_m(S_i)\, p_i \qquad (1) $$

where $Pe_m(S_i)$ = probability of making an error given that $S_i$ was transmitted and

$$ Pe_m(S_1) = \int_{-\infty}^{\infty} \int_{x_2}^{\infty} f_{m1n}(x_1, x_2)\, dx_1\, dx_2 \qquad (2) $$

$$ Pe_m(S_2) = \int_{-\infty}^{\infty} \int_{x_2}^{\infty} f_{m2n}(x_1, x_2)\, dx_1\, dx_2 \qquad (3) $$

where

$f_{m1n}(x_1, x_2)$ = p.d.f. of max {m-2 independent noise samples, a sample $g(S_1) + N_2$} given that $S_1 + N_1 = x_2$

and

$f_{m2n}(x_1, x_2)$ = p.d.f. of max {m-3 independent noise samples, a sample $g(S_2) + N_1$, or sample $g(S_2) + N_3$} given that $S_2 + N_2 = x_2$

C.  Development  of  the  Estimator 

The p.d.f.'s $f_{m1n}(x_1, x_2)$ and $f_{m2n}(x_1, x_2)$ are not dependent on all the factors that compose the output of the signal filter but are dependent on the part caused by the signal transmitted. In terms of c.d.f.'s

$$ F_{m1n}(x)\big|_{s_i} = F_n^{m-2}(x)\, F_n(x - g(s_i)) \qquad (4) $$

$$ F_{m2n}(x)\big|_{s_i} = F_n^{m-3}(x)\, F_n^2(x - g(s_i)) \qquad (5) $$

The appropriate p.d.f.'s are obtained by differentiating the above equations, yielding

$$ f_{m1n}(x)\big|_{s_i} = (m-2) F_n^{m-3}(x)\, F_n(x - g(s_i))\, f_n(x) + F_n^{m-2}(x)\, f_n(x - g(s_i)) \qquad (6) $$

$$ f_{m2n}(x)\big|_{s_i} = (m-3) F_n^{m-4}(x)\, F_n^2(x - g(s_i))\, f_n(x) + 2 F_n^{m-3}(x)\, F_n(x - g(s_i))\, f_n(x - g(s_i)) \qquad (7) $$

$$ f_{sig}(x_2)\big|_{s_i} = f_n(x_2 - s_i) \qquad (8) $$

From Eqs. (4) - (8), it is seen that given the function $g(s_i)$, the p.d.f. of the signal
and the p.d.f. of the noise $f_n(x)$, the probability of error can be calculated.
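The relation between the c.d.f. of the largest idle-filter output and its p.d.f. can be sketched numerically for the end-filter case, in which one adjacent filter carries $g(s)$ plus noise and the other m-2 idle filters carry noise only. The choice of unit-variance Gaussian noise, the values of `m` and `g`, and all function names are assumptions made for this illustration.

```python
import math

def Phi(x):
    """Standard normal c.d.f. (assumed noise distribution)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """Standard normal p.d.f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf_max_one_adjacent(x, m, g):
    """C.d.f. of max{m-2 independent noise samples, g + noise}."""
    return Phi(x) ** (m - 2) * Phi(x - g)

def pdf_max_one_adjacent(x, m, g):
    """Derivative of cdf_max_one_adjacent with respect to x (product rule)."""
    return ((m - 2) * Phi(x) ** (m - 3) * phi(x) * Phi(x - g)
            + Phi(x) ** (m - 2) * phi(x - g))

def error_probability_given_output(x2, m, g):
    """P(largest idle output > x2) when the signal filter output equals x2."""
    return 1.0 - cdf_max_one_adjacent(x2, m, g)

print(error_probability_given_output(2.0, m=8, g=0.3))
```

Averaging `error_probability_given_output` over the distribution of the signal filter output would give the conditional error probability for that transmitted signal; that averaging step is omitted here.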

It should be noted that for every value of the signal $s_i$, the distribution of the idle filter outputs is different. Therefore, the samples from the noise filters taken at one time cannot be compared with the samples taken at another instant of time. If the dependence between the signal and noise filters were removed, all the noise samples could be used to form $\hat{P}e_m$, an estimator of $Pe_m$. In forming $\hat{P}e_m$, an estimate of the linear dependence, $\rho$, between the signal and idle filters is made. Denoting by $A_i$ the ith output of the active filter, it will be assumed that $A_1, \dots, A_p$ are from the first filter, $A_{p+1}, \dots, A_r$ are from the mth filter, and $A_{r+1}, \dots, A_k$ come from the other filters.

The procedure used for approximating $f_{m1n}$ and $f_{m2n}$ follows. All of the samples of the nonadjacent idle filters are received and stored. To approximate $f_{m1n}$, given that the output of the active filter is $A_i$, the values of the nonadjacent idle filters corresponding to the time $A_i$ was received are first removed. The p.d.f. $f_{m1n}$ is then approximated by taking every group of m-1 samples, adding $\hat{\rho}\, S_{est,i}$ to the (m-1)th sample, and then taking the maximum of the m-1 samples, where $S_{est,i} = A_i - \bar{N}$ and $\bar{N}$ is the statistical average of all nonadjacent idle filters. The kth of these samples will be denoted by $M_k^{A_i}$. The set of samples $\{M_k^{A_i}\}$ is an approximation to a set of samples of the distribution with p.d.f. $f_{m1n}$ given that the output due to the signal is $A_i$. A similar procedure is followed to approximate $f_{m2n}$. The estimate of the probability of error contains two terms: one which estimates the probability of error given that the first or the last filter is active, and one which estimates the probability of error given that one of the other filters is active, or


, r d 

■ (Pi  P ) ')  } u(A.  - M.  ) 

'f'l  t'm'  rd  Aj  J 1 Ji 

/m-1  \ r k d 


where  u(.  ) is  the  unit  step  function,  and  d is  the  maximum  integer  of 

Following a procedure similar to the one outlined in Ref. 1, it can be shown that the
estimator $\hat{P}e_m$ is related to the Mann-Whitney statistic.
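The connection to the Mann-Whitney statistic can be made concrete with a small sketch. It is not the estimator of this report, whose weighting by the a priori probabilities is not reproduced here; it only counts, over all pairs, how often a sample of the largest idle output exceeds an active-filter output. The function name and the sample values are hypothetical.

```python
import numpy as np

def pairwise_error_estimate(active, idle_max):
    """Fraction of (A_i, M_j) pairs in which the idle maximum M_j exceeds A_i."""
    a = np.asarray(active, dtype=float)[:, None]     # column of active outputs
    m = np.asarray(idle_max, dtype=float)[None, :]   # row of idle maxima
    return float(np.mean(m > a))

active = [3.0, 5.0]          # illustrative active-filter outputs
idle_max = [1.0, 4.0, 6.0]   # illustrative maxima of idle-filter outputs
print(pairwise_error_estimate(active, idle_max))
```

The count of pairs with $M_j > A_i$ is the Mann-Whitney U count for the two samples; dividing by the number of pairs turns it into an estimate of the probability that an idle output exceeds the active output.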


D.  Estimation  of  the  Linear  Dependence  Coefficient  p 

In general, g(s) is unknown and it is impossible to develop general procedures
to estimate it. If the dependence is weak, as is the case in most practical systems,
satisfactory answers are obtained if g(s) is replaced by $\rho s$. The linear coefficient,
$\rho$, can then be estimated using


f A.I 


) I A. + ) 

. ^ , a.  1 . I—  , 

J=P+1  Jg  ■'  J = 


. "'a. 

Ju  Je 


- I A 


where 


$A_j$ = output of the active filter at the jth instant

$I_{a_{ju}}$ = output of the adjacent idle filter at the jth instant which corresponds to the filter which is matched to a signal greater than the active filter

$I_{a_{j\ell}}$ = output of the adjacent idle filter at the jth instant which corresponds to the filter which is matched to a signal less than the active filter


, p r k 

— ;i  + Tl  + )(I  +1) 

k-r  J—.  a.  .^,a.  .^,  a.  a. 

j=l  JU  J=p+1  je  j=r+l  ju  je 


In deriving the expression for the estimator of $\rho$, the assumption was made that
the decisions at the receiver were correct. This assumption affects the accuracy of
the estimator insignificantly.
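The idea of replacing g(s) by $\rho s$ can be illustrated with a simple least-squares slope. This is a stand-in and not the estimator of this report: it pools all instants and ignores the separate treatment of the first, last and interior filters. The function name, the simulated signal range and the noise level are assumptions chosen only for the example.

```python
import numpy as np

def estimate_rho(active, adjacent):
    """Least-squares slope through the origin of adjacent-idle outputs on active outputs."""
    a = np.asarray(active, dtype=float)
    i = np.asarray(adjacent, dtype=float)
    return float(np.dot(a, i) / np.dot(a, a))

rng = np.random.default_rng(1)                 # fixed seed, for illustration only
rho_true = 0.2                                 # assumed weak linear dependence
signal = rng.uniform(2.0, 4.0, size=5000)      # hypothetical random signal amplitudes
active = signal + 0.1 * rng.standard_normal(5000)               # S + N
adjacent = rho_true * signal + 0.1 * rng.standard_normal(5000)  # rho * S + N
rho_hat = estimate_rho(active, adjacent)
print(round(rho_hat, 3))
```

Because the active output itself contains noise, a slope of this kind is biased slightly toward zero; with the small noise level assumed here the effect is negligible.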
A NEW CLASS OF NON-RANK ROBUST DETECTORS
J.  I.  Cochrane  and  L.  Kurz 

In this report, a new class of non-rank robust detectors, which are related to the generalized quantile detector, is considered. Unlike the non-rank detectors based on quantile statistics with fixed scoring vectors, the new class of detectors uses a random scoring vector. The new statistics, called $C_i$ statistics, include the Ching-Kurz statistics as a special case and allow useful asymptotic expressions for a wide class of linear non-rank tests. The notation used in this report is the same as in Reference 1.

A. Asymptotic Normality of the $C_i$ Statistics

The $T_i$ statistics of Ref. 1 are designed in three interrelated steps. First, the vector $[\beta_1, \dots, \beta_M]$ is chosen to achieve the desired weighting of the empirical distribution functions. Second, the vector $[\lambda_1, \dots, \lambda_k]$ is chosen to maximize some figure of merit. Third, the best scores $[C_1, \dots, C_k]$ are chosen to maximize some, possibly different, figure of merit. For example, the $\lambda_i$ may be selected to guarantee robustness over a wide class of distributions, while $C_{opt}$ is chosen to maximize the efficacy for a particular distribution.

The  process  of  forming  the  value  of  T^  is  shown  schematically  below: 


In  this  section  we  form  the  new  statistics  by  using  these  same  steps  in  a different 
order  to  avoid  the  ranking  necessary  to  observe  . Let 


where $[x_1 < \dots < x_k]$ is an arbitrary set of thresholds and $a_{i,1}, \dots, a_{i,k}$ is a set of scoring functions defined on $[0 < t_1 \dots t_k < 1]$. These statistics are formed by the process shown as follows.

When $a_{i,j} = C_j$, $j = 1, \dots, k$, Eq. (1) reduces to the m-interval tests of Ching and Kurz. Intuitively, the $T_i$ and $C_i$ tests are closely related. In one case $\lambda$ is fixed and $x$ is a random vector, while in the other $x$ is fixed and $\lambda$ is the random vector. The similarity will become even more apparent for the particular cases considered in this section. The following theorem makes precise the conditions for asymptotic normality of the $C_i$ statistics.

Theorem 1. Let $\{X_{ij},\ i = 1, \dots, M;\ j = 1, \dots, n_i\}$ be a set of observations, where $X_{i\cdot}$ is the set of $n_i$ i.i.d. observations from $F_i$. Let $F_i$ be a set of strictly monotonic distributions. Assume that the set of functions $F_i H^{-1}(t)$ on $[\epsilon, 1-\epsilon]$ for any $\epsilon > 0$ have uniformly equicontinuous derivatives. Further, assume that the scoring functions $a_{i,j}(t_1, \dots, t_k)$, $j = 1, \dots, k$, are bounded and have uniformly equicontinuous second partial derivatives. Then the vector $C_{in}$, $i = 1, \dots, M$, converges in distribution to a jointly normal vector $C_{io}$, $i = 1, \dots, M$, with


E{C.  } = 7 
lo'  .L 
J = 1 
where $\{\lambda_1, \dots, \lambda_k\}$ is a set of parameters uniquely defined by $\lambda_i = H(x_i)$, and

IK  11 


covi 


^ -1 


{C.  ,C.}  = > n'‘r6.  a + 3 C |7  [5.  a + R C ] 

lo’  jo^  II  ^ lu- 


(3) 


where $\delta_{ij}$ is the Kronecker delta and


- 1 ^k,k1 


C^  = [C.  ,...,C.  1 
u ^ lu  * ku  ^ 


k 3a,  . 

C.  - ^ 


t =\. 
u u 


F 

u j 


(4) 


and $\Sigma$ is a k by k matrix with elements

$$ \sigma_{ij} = F_i (1 - F_j), \qquad i \le j \qquad (5) $$

Proof.  (Because  of  its  length  and  complexity,  the  proof  is  omitted. ) Under  the 
null  hypothesis,  the  variance  simplifies  sharply,  since  C^^  = C^^  and  = 5^  for 
u,  V = 1 M. 
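A matrix with entries of the form $F_i(1 - F_j)$ for $i \le j$ is the familiar covariance structure of an empirical distribution function evaluated at ordered thresholds: for n i.i.d. observations, n times the covariance of the empirical c.d.f. at $x_i \le x_j$ equals $F(x_i)(1 - F(x_j))$. The sketch below checks this fact by simulation for Uniform(0, 1) data, where $F(x) = x$. The function names, thresholds and sample sizes are illustrative, and the link to the matrix of the theorem is offered as an interpretation rather than as the report's derivation.

```python
import numpy as np

def edf_covariance(levels):
    """Matrix with entries min(l_i, l_j) - l_i * l_j, i.e. l_i * (1 - l_j) for i <= j."""
    l = np.asarray(levels, dtype=float)
    return np.minimum.outer(l, l) - np.outer(l, l)

def simulated_covariance(thresholds, n, trials, rng):
    """n times the sample covariance of the empirical c.d.f. of Uniform(0, 1) data."""
    x = rng.random((trials, n))
    t = np.asarray(thresholds, dtype=float)[None, None, :]
    edf = (x[:, :, None] <= t).mean(axis=1)      # shape (trials, number of thresholds)
    return n * np.cov(edf, rowvar=False)

levels = [0.2, 0.5, 0.8]
print(edf_covariance(levels))
print(simulated_covariance(levels, n=50, trials=20000, rng=np.random.default_rng(0)))
```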

For a particular choice of $x$, it is possible to choose the score functions and
their derivatives to control the asymptotic distribution of $C_i$. In Sections B and C two
design techniques for the class of $C_i$ statistics are suggested.

B. Nonparametric Aspects of the $C_i$ Statistics

The distribution of $C_i$ is dependent on the quantiles of the underlying distribution, even under the null hypothesis, and the nonparametric property of fixed Type I error for the tests of Ref. 1 is lost. If the only problems of interest were binary tests with the Neyman-Pearson formulation (constant-size test), this would be a serious drawback. If, however, the statistics are designed for binary tests with minimum probability of error or for more general composite hypothesis testing, the nonparametric property is less important than other aspects of the desired robustness. This point may best be illustrated by a simple example.

Consider a two-sample problem using the Mann-Whitney statistic. In asymptotic analysis, the separation between the expected values of the test statistic is

$$ E\{T_1 \mid H_1\} - E\{T_1 \mid H_0\} \propto \int_{-\infty}^{\infty} F_1\, dF_2 $$

and if minimum probability of error is to be achieved (asymptotically) the decision must be to accept $H_0$ if $T_1 \le C$ and $H_1$ if $T_1 \ge C$, where

$$ C \propto \int_{-\infty}^{\infty} F_1\, dF_2 $$

But  the  choice  of  this  threshold  cannot  be  made  without  a priori  knowledge  of  this 
parameter  or  an  adaptive  or  learning  procedure  to  set  this  threshold.  In  this  example, 
the  nonparametric  property  of  the  rank  tests  is  not  helpful  in  selecting  the  decision 
partition.  In  M-ary  and  composite  hypothesis  testing  problems,  the  choice  of  critical 
regions  in  the  decision  space  spanned  by  the  intermediate  statistic  is  always  dependent 
on  some  parameters  of  the  underlying  distribution.  In  this  sense,  the  absence  of  the 
nonparametric property of fixed Type I error in non-rank tests is not important.

The asymptotic distribution of $C_i$ is expressed in terms of $\lambda$, which is not known for a general nonparametric problem. This difficulty may be approached in two ways. First, in stationary noise, it is possible to use iterative techniques to form estimates of $\lambda$ for a fixed $x$ which converge with probability one to the true value of $H(x)$. Alternately, $x$ may be controlled by stochastic approximation techniques so that the thresholds converge with probability one to the desired values $H^{-1}(\lambda_i)$ for a fixed set $\lambda$. Second, it is possible to choose scoring functions so that the expected value and variance of the $C_i$ statistics are locally distribution-free. Using this technique, the $C_i$ statistics are shown in Section D to be asymptotically locally equivalent to the $T_i$ statistics of Reference 1.


C.  Minimum- Variance  Scores 


A useful figure of merit for asymptotic analysis of the $C_i$ statistics is $\mathrm{Tr}[\Gamma_o]$, where $\Gamma_o$ is the $M \times M$ covariance matrix of $C_{io}$, $i = 1, \dots, M$. For normal vector statistics, this is the average variance of the independent set of random variables $B = TC$, where $T'\Gamma T = \mathrm{diag}[\sigma_1^2, \dots, \sigma_M^2]$. For the normal distribution specified by Eqs. (2) and (3) under the null hypothesis,


Tr{F„}  = 


M 

I 


•714*2 


V 1 1 


C^^a  + M 


M , 

^ 3k 


C 


(6) 


where C, a and $\Sigma$ are defined in Section A. Since C does not appear in the expected value of $C_{io}$, it is possible to select C to minimize $\mathrm{Tr}\{\Gamma_o\}$ independent of a. This minimization yields
C .

“ opt 


0.  n.-’ 

M - , 

M ''  3^nr 
*-^11 


(■?) 


and 


min  {Tr  [ F^l  } 

o 


Tv 


— 1 


1 1 


M''  3?n:^  , 

- 1 J 


(9) 


The selection of the $\{\beta_j\}$ and a vectors for a particular problem must consider more complex figures of merit; for C given by C = ka and a given $\{\beta_j\}$, the asymptotic covariance of $C_{io}$ under the null hypothesis is identical to within a constant to that of the $T_i$ statistics of Reference 1. The asymptotic expected values are identical. All of the techniques and results of Ref. 1 are thus applicable to the $C_i$ design problems.

It is interesting to note that Eq. (7) is minimized over $\beta$ by selecting $\beta_1 = \dots = \beta_M$; the resulting variance is given by


min  min  Tr{r  } = a'^Y a ( T u.  S 

g C o - --  JVi  1 


(9) 


This rather surprising result is verified by noting that $\beta_1 = \dots = \beta_M$ achieves equality for the Stieltjes integral formulation of the Cauchy-Schwarz inequality. For that special case

$$ C_{opt} = -a \qquad (10) $$


D. Locally $T_i$-Equivalent $C_i$ Statistics

In the previous discussions, it has been emphasized that threshold statistics are not nonparametric under the null hypothesis. In particular, the expected value of $C_{io}$ is a function of the unknown values $\lambda_i = H(x_i)$. In this section, we seek to minimize this uncertainty in the expected value under the null hypothesis by suitable choice of the scoring function. From the definition of C in Eq. (3), the only influence that can be applied to $a_j(\lambda_1, \dots, \lambda_k)$ to control the asymptotic expected value is the set of first partial derivatives of $a_j$. We choose those partials so that the expected value remains approximately constant in a neighborhood of the design values $\lambda_1, \dots, \lambda_k$, by setting

(11) 


k 9a. 


a.X. 
1 1 


which is equivalent to setting

$$ C = -a \qquad (12) $$

where the terms of order $n^{-1/2}$ have been neglected. Using this value ensures that the expected value of $C_{io}$ will remain constant to within a term of order $n^{-1/2}$ for small errors in setting the thresholds $[x_1, \dots, x_k]$. From the previous section, this is also the solution for C which minimizes the average variance when $\beta_1 = \dots = \beta_M$.
THE M-INTERVAL PARTITION DETECTOR (MIPD) WITH DEPENDENT INPUT DATA
P.  Kersten  and  L.  Kurz 

Detectors that are nonparametric in the sense that the Type I error is independent of the underlying distribution lose this property once the samples used become dependent. In the past it has been assumed that the samples used in the MIPD are independent. Once this assumption is dropped, the performance of this test under the hypothesis cannot be guaranteed. However, the situation can be circumvented to some extent. Woinsky and Kurz proposed a class of detectors which retain their nonparametric character with dependent samples. By properly grouping the dependent samples one creates a sequence of independent vector samples of dimension n. This is accomplished by placing n consecutive samples in a vector of length n and then skipping $\gamma - 1$ samples to ensure that the next vector of n components is independent of the first vector. This assumes that one is sampling at $\gamma$ times the rate required to obtain independent samples, $\gamma = 1, 2, \dots$ One then applies a fixed transformation from n-dimensional to one-dimensional space, yielding a sequence of independent random variables. Mathematically, this can be stated using the notation of Ref. 2 in the following manner. Let $X_1, \dots, X_N$ be a sequence of independent samples with a total time of observation of NT. The sampling rate is then increased by a factor $\gamma$ and the resulting sequence is denoted by $Y_1, \dots, Y_{N\gamma}$. Grouping n consecutive Y samples into a vector $Z$, the resulting sequence of vectors is denoted by $Z_1, \dots, Z_{[N\gamma/(n+\gamma-1)]}$, where $Z_i = (Y_{(n+\gamma-1)(i-1)+1}, \dots, Y_{(n+\gamma-1)(i-1)+n})$.

Under the null hypothesis the $Z_i$ have a joint distribution function given by $F(x_1, \dots, x_n)$ and under the alternative they are samples from a noise distribution which includes the signal. After passing through the predetector, which is actually just a transformation $\ell$, the resulting sequence $\ell(Z_1), \dots, \ell(Z_L)$, where $L = [N\gamma/(n+\gamma-1)]$, has a distribution $F_z$ under the null hypothesis and $G_z$ under the alternative. The samples are used in a standard nonparametric detector to obtain a constant false alarm rate. The central idea which makes this model so simple and yet so effective is to concentrate the effect of the dependence within the transformation and thus allow the detector to operate in a nonparametric mode.

The effect of the transformation is to create a new sample space whose underlying
distribution is determined by $\ell$ and the distribution of the samples $Y$. The transformation
$\ell$ is selected to be the sample mean. Note that for independent Gaussian samples
with known variance this is a sufficient statistic.
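
The grouping and predetection step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `group_and_reduce` and the toy data are hypothetical, and the sample mean is used as the transformation, as selected in the text.

```python
from statistics import fmean

def group_and_reduce(y, n, gamma):
    """Split y into vectors of n consecutive samples, skipping gamma - 1
    samples after each vector, and return the sample mean of each vector."""
    stride = n + gamma - 1          # distance between the starts of two vectors
    means = []
    start = 0
    while start + n <= len(y):
        means.append(fmean(y[start:start + n]))
        start += stride
    return means

# 12 oversampled values, n = 3 samples per vector, gamma = 2 (skip 1 sample).
# The value 99 marks the samples that are skipped between vectors.
y = [1, 2, 3, 99, 4, 5, 6, 99, 7, 8, 9, 99]
z = group_and_reduce(y, n=3, gamma=2)
print(z)
```

In this toy case the 12 samples yield 3 vectors, which agrees with $[N\gamma/(n+\gamma-1)] = [12/4] = 3$.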

Following Ref. 3, one evaluates the performance of this detector against another
detector via the asymptotic relative processing time (ARPT), which is defined to be

$$\mathrm{ARPT}(T_1, T_2) = \mathrm{ARE}(T_1, T_2) \times (t_2/t_1)$$

where $\mathrm{ARE}(T_1, T_2)$ is the asymptotic relative efficiency of the first detector with
respect to the second and $t_i$ is the time between samples for detector $T_i$. Note that
$t_2 = t_1$ implies that all the samples are independent and the ARPT reduces to the
ARE.


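For many symmetric distributions of $Y_i$, the transformation
will be approximately normally distributed with zero mean and with variance

$$\sigma^2 = \frac{\sigma_Y^2}{n}\left(1 + 2 \sum_{k=1}^{\min(\gamma, n) - 1} \left(1 - \frac{k}{n}\right) \rho_Y(kT/\gamma)\right) \qquad (1)$$

where $\rho_Y$ is the correlation coefficient of $Y_i$ and $Y_{i+k}$. In general, define

$$Z_i = \frac{1}{n} \sum_{j=1}^{n} Y_{(n+\gamma-1)(i-1)+j}.$$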


To apply this to the m-Interval Partition Test, one now uses the samples $Z$ to obtain
the quantiles of $F_z$. The MIPD and its first moment are given by


EIT,  I H^)  = i I S,  *1'  ' i Z Pi  = » 

J = l 1=1  1=1 


where 


®i  ^ ^®i-l'“il’  "i  ^ 


Accordingly, 


t 7 

Var(T  I H ) = — y P 
' 2 ' o mn  ^1 
1=1 


E(TjHj)=  Spi(F,(«i-e)-F^(«..j-e)). 


'«  •»»  «4 


On the other hand, for the MIPD constructed upon the independent samples one obtains
an efficacy of the exact same form with $\beta$ replaced with $b$, $\alpha'$ replaced by $\alpha$ and
$f_z$ replaced with $f_x$. Therefore, the asymptotic relative processing time is

$$\mathrm{ARPT}(T_z, T_x) = \frac{\gamma}{n+\gamma-1}\,\frac{\mathcal{E}(T_z)}{\mathcal{E}(T_x)}.$$

To examine a specific case, it is assumed that both $T_z$ and $T_x$ use the locally
most powerful scores, i.e., $\beta_i = f_z(\alpha'_{i-1}) - f_z(\alpha'_i)$ and $b_i$ by the same formula with $\alpha'$
replaced with $\alpha$ and $f_z$ replaced with $f_x$. Then the

$$\mathrm{ARPT}(T_z, T_x) = \frac{\gamma}{\gamma+n-1}\,\frac{\sum_{i=1}^{m} \beta_i^2}{\sum_{i=1}^{m} b_i^2}.$$




If $\gamma = 1$ so that $Y_i = X_i$ are i.i.d. $N(0, 1)$ samples, then $P(Z \le x) = \Phi(\sqrt{n}\,x)$ and
$\Phi(\sqrt{n}\,\alpha'_i) = i/m = \Phi(\alpha_i)$, which implies $\alpha'_i = \alpha_i/n^{1/2}$. Then

$$f_z(\alpha'_i) = (n/2\pi)^{1/2} \exp(-\alpha_i'^2 n/2) = (n/2\pi)^{1/2} \exp(-\alpha_i^2/2) = n^{1/2} f_x(\alpha_i)$$

so $\beta_i^2 = n b_i^2$ and the ARPT = 1. However, should the samples be dependent, the ARPT becomes

$$\mathrm{ARPT} = \frac{\gamma/(n+\gamma-1)}{\sigma^2}$$

where $\sigma^2$ is the variance given by Equation (1). This is Eq. (6)
of Ref. 3, and shows that the ARPT increases as shown in Figs. 1-3 of the same
reference, precisely as the sign statistic and Wilcoxon Test Statistic.
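
The independent case ($\gamma = 1$) can be checked numerically. The sketch below is an illustration, not the authors' computation: it assumes the score form $f(\alpha_{i-1}) - f(\alpha_i)$ and the ARPT expression as read from the text above, uses equiprobable quantiles $i/m$, and the helper names (`lmp_scores`, `arpt_gamma_one`) are hypothetical.

```python
from math import inf
from statistics import NormalDist

def lmp_scores(dist, quantiles):
    """Scores f(a_{i-1}) - f(a_i) over consecutive quantiles of dist."""
    def pdf(a):
        # The density vanishes at the two infinite end points.
        return 0.0 if a in (-inf, inf) else dist.pdf(a)
    return [pdf(lo) - pdf(hi) for lo, hi in zip(quantiles, quantiles[1:])]

def arpt_gamma_one(n, m):
    """ARPT of the grouped test against the ungrouped test when gamma = 1."""
    x = NormalDist(0.0, 1.0)               # a single N(0, 1) sample
    z = NormalDist(0.0, 1.0 / n ** 0.5)    # mean of n i.i.d. N(0, 1) samples
    qx = [-inf] + [x.inv_cdf(i / m) for i in range(1, m)] + [inf]
    qz = [-inf] + [z.inv_cdf(i / m) for i in range(1, m)] + [inf]
    b = lmp_scores(x, qx)
    beta = lmp_scores(z, qz)
    ratio = sum(v * v for v in beta) / sum(v * v for v in b)
    return (1.0 / n) * ratio               # gamma / (gamma + n - 1) with gamma = 1

print(arpt_gamma_one(n=4, m=4))
```

Under these assumptions the ratio of squared scores equals $n$, so the printed value should be 1 up to rounding error, in line with the statement that the ARPT reduces to 1 for independent Gaussian samples.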

The performance of both the Wilcoxon and Sign Test when the noise is from a
Double-Exponential distribution is degraded with respect to its performance in a
Gaussian noise environment. The ARPT for the Double-Exponential is $2/\pi$ times and
$1/\pi$ times the ARPT for the Gaussian CDF for the Wilcoxon and Sign Test, respectively.

One suspects that the ARPT for the MIPD should be greater than that of the Sign Test
under the same comparison. Calculation reveals that for m = 4 partitions the corresponding
multiplicative factor is 0.43. Although this is not as good as the Wilcoxon
Statistic, one must remember that the MIPD is considerably easier to implement than
the Wilcoxon Test. For m > 4 the gap between the two tests is essentially closed.

A ROBUSTIZED CUMULATIVE DECISION FEEDBACK SYSTEM
H. S. Ashtiani and L. Kurz

Since  it  was  proved  that  transmission  with  vanishingly  small  errors  is  possible, 
at  least  in  theory,  as  long  as  the  information  rate  is  below  channel  capacity,  there  has 
been  a continuous  effort  on  the  part  of  investigators  to  devise  methods  to  achieve  or  at 
least  approach  this  goal. 

This  effort  has  been  pursued  in  several  directions.  One  direction  has  been  to 
develop  sophisticated  coding  and  decoding  schemes  to  detect  and  correct  errors 
occurring  during  transmission  so  that  the  final  probability  of  error  can  be  kept 
arbitrarily  low;  this  method,  in  many  cases,  has  resulted  in  long  sequences  of  coded 
messages  and  complex  coding  and  decoding  schemes  that  are  not  very  desirable  from 
a practical  point  of  view. 

Another approach has been to include a feedback link in the transmission system.
This approach takes into account the fact that in practice a second link is often
available whether or not it is used as a feedback link.

The price that has to be paid for improving the error rates, assuming fixed
signal power, in both cases is a lower information rate. Attention has been paid to
finding ways of improving error rates without compromising the information rate of the
system too much. In the first approach this has resulted in more efficient coding
techniques, and in the second approach in feedback systems that do not disregard the
ambiguous messages and use more of the information received, e.g., the cumulative or
sequential binary decision feedback system (CBDFS).

In this report, we attempt to robustize a sequential detector known as the
cumulative binary decision feedback system (CBDFS).

By robustizing, in the context of the decision feedback system under consideration,
we mean desensitizing the performance of the system to impulsive noise. This
is accomplished by means of setting thresholds in the decision device, and identifying
the samples falling in the regions beyond the thresholds as ambiguous samples. In
the cumulative system under consideration these samples are not disregarded; however,
the decision is postponed until the sum of the repeated sequence of samples lies in
regions identified as "safe" regions for making a decision. These safe regions lie
between a null zone and the two regions beyond the impulsive noise thresholds. The null
zone is included to protect against decisions made on small samples considered
ambiguous as the result of Gaussian noise interference.

A. The Cumulative Binary Decision Feedback System in Gaussian Noise

The CBDFS in Gaussian noise has been discussed and formulated in Reference 1.
In this system the receiver integrates the sequence of repeats of a binary digit (the
message unit) until the integrated signal passes the null zone boundaries $-k_1$ or $k_1$,
in which case "zero" or "one" is accepted.

The detection problem of finding the average detection time (or average sample
number) corresponds to the first-passage time in the random walk problem.³

Let $P_j(z, r)$, $j = 0, 1$, represent the joint probability, when $x_j$ is being sent, that
after $r$ transmissions the integrated signal is $z$, and that $z$ has remained ambiguous
(i.e., in the null zone) on all previous transmissions. In the limiting case, when the
number of transmissions per message digit is large, the central limit theorem is
applicable, so that the integrated signal has a Gaussian distribution in the absence of
the null zone boundaries. $P_j(z, r)$ may be evaluated, for large $r$, by evaluating the
corresponding quantity $P_j(z, t)$. The integration time is given by $t = r/a$, $a$ being the
number of transmissions per unit time.

From the random walk problem $P_j(z, t)$ is given as a solution of the partial
differential equation

$$\frac{\partial P_j(z,t)}{\partial t} = B_j\,\frac{\partial P_j(z,t)}{\partial z} + D_j\,\frac{\partial^2 P_j(z,t)}{\partial z^2}$$

subject to the boundary conditions

$$P_j(k_1, t) = P_j(-k_1, t) = 0$$
$$P_j(z, 0) = \delta(z)$$

and where

$$B_j = -a s_j$$
$$D_j = a N/2$$

with $s_j$ the mean of the received signal, and $N$ the corresponding variance.

The average probability of error and the ASN for a symmetrical system with
equal a priori digit probability, when the digits are transmitted by sending $s$
through a channel of additive Gaussian noise of power $N$, are given in Reference 1.

$$P_e = \frac{1}{1 + e^{2 s k_1/N}}, \qquad \mathrm{ASN} = \frac{k_1}{s}\,\frac{e^{2 s k_1/N} - 1}{e^{2 s k_1/N} + 1}$$

B. The CBDFS in Impulsive Noise

In this section the performance of the cumulative binary decision feedback
system is analyzed in an environment of impulsive noise.

Impulsive noise is often identified by a distribution having a large variance.
This characteristic is reflected in the tails of the distribution (i.e., thicker tails).

One problem associated with analyzing the effects of impulsive noise in communication
systems is finding a suitable distribution that has the thick tails indicated by
impulsive noise interference and that, at the same time, permits mathematically
tractable analysis. One such model is given by functions of the form


f(x)  = 


1 < II  < 2 . 


A simple yet reasonably accurate characterization is obtained using the Laplacian
distribution $f_L(x) = (\gamma/2)\,e^{-\gamma|x|}$. In this section the CBDFS is analyzed in an impulsive
noise environment, using this distribution.* With a Laplacian distribution, the
Fokker-Planck equation of Section A takes the form

$$\frac{\partial P_j(z,t)}{\partial t} = \mathrm{SGN}(z)\,\frac{\partial P_j(z,t)}{\partial z} + \frac{1}{\gamma}\,\frac{\partial^2 P_j(z,t)}{\partial z^2}, \qquad j = 0, 1 \qquad (1)$$

where

$$\mathrm{SGN}(z) = \begin{cases} 1, & z > 0 \\ -1, & z < 0. \end{cases}$$

The boundary conditions are the same as in Section A. Assuming symmetrical
thresholds $k_2$ and $-k_2$, these boundary conditions are

$$P_j(k_2, t) = P_j(-k_2, t) = 0, \qquad P_j(z, 0) = \delta(z). \qquad (2)$$


Using  the  classical  method  of  separation  of  variables,  Eq.  (1)  is  solved  subject  to  the 
boundary  conditions  Eq.  (2),  yielding 


* A more  general  noise  model  in  this  class  is  treated  by  R.  Chassaing  and  L.  Kurz 
elsewhere  in  this  report. 


$$\mathrm{Var}(x) = \frac{2}{\gamma^2} = N$$

C. A Robustized CBDFS

In this section we develop a robustized version of the cumulative binary decision
feedback system discussed in Sections A and B.

Consider the case of binary signal transmission through a noisy channel, where the
noise is assumed to be a mixture of Gaussian and impulsive noise. This mixture is
represented by a c.d.f. of the form

$$F(x) = (1 - \epsilon)\,P(x) + \epsilon\,H(x)$$

where $P(x)$ is a low variance Gaussian distribution representing the background noise,
and $H(x)$ is a large variance distribution representing the occasional excursions of
impulsive noise.
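
As an illustration of the mixture model, the following sketch draws noise samples from $(1 - \epsilon)P + \epsilon H$. It assumes, purely for the example, that $H$ is also Gaussian with a much larger standard deviation; the function name and the numerical values are hypothetical.

```python
import random

def mixture_noise(eps, sigma_bg, sigma_imp, rng):
    """Draw one sample from (1 - eps) * P + eps * H with Gaussian P and H."""
    # With probability eps the sample comes from the impulsive component H.
    sigma = sigma_imp if rng.random() < eps else sigma_bg
    return rng.gauss(0.0, sigma)

rng = random.Random(0)
samples = [mixture_noise(0.05, 1.0, 10.0, rng) for _ in range(50000)]
var = sum(x * x for x in samples) / len(samples)

# Theoretical variance of the mixture: (1 - eps) * sigma_bg**2 + eps * sigma_imp**2
print(var, 0.95 * 1.0 + 0.05 * 100.0)
```

Even a 5% contamination dominates the total noise power here, which is the situation the thresholds $\pm k_2$ are meant to guard against.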

The  purpose  of  this  section  is  to  improve  the  reliability  of  the  performance  of 
the  CBDFS  in  impulsive  noise  through  desensitizing  its  performance  to  the  impulsive 
noise  component  of  the  mixed  distributed  interference. 

As  suggested  by  Rappaport  and  Kurz,  this  improvement  may  be  obtained  by  the 
addition  of  an  upper  and  a lower  threshold,  k2  and  -k2  in  the  detector.  As  the 
result,  the  signal  region  is  divided  into  five  regions.  Two  of  these  regions  are 
regions  of  acceptance  of  the  two  binary  levels;  the  upper  is  the  region  of  acceptance 
of  "1,  " and  the  lower  is  the  region  of  acceptance  of  "-1.  " The  middle  region  is  the 
null  zone  or  the  region  of  small  signal  ambiguity,  and  the  two  outer  regions  are 
regions of large signal ambiguity (see Figure 1).

[Figure: regions of the signal axis, from top to bottom: upper large signal ambiguity region; region of acceptance of binary level "1"; region of acceptance of binary level "-1"; lower large signal ambiguity region.]

Fig. 1. Decision thresholds.


During transmission, the transmitter repeats a digit, and the receiver
integrates the received sequence of repeats. When the integrated signal, $z$, crosses
one of the four boundaries $\pm k_1$, $\pm k_2$, the receiver accepts "1" if
$k_1 < z < k_2$ and accepts "-1" if $-k_2 < z < -k_1$. The transmitter is then asked, via
the feedback link, to start sending the next digit.
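
The receiver logic just described can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that integration simply continues while $z$ is in the null zone or in a large signal ambiguity region, and the names `classify` and `receive_digit` are hypothetical.

```python
def classify(z, k1, k2):
    """Map the integrated signal z to one of the five regions."""
    if k1 < z < k2:
        return "accept +1"
    if -k2 < z < -k1:
        return "accept -1"
    if -k1 <= z <= k1:
        return "null zone"
    return "large-signal ambiguity"

def receive_digit(digit, s, k1, k2, noise, max_repeats=10_000):
    """Integrate repeats of digit * s plus noise until z is in an acceptance region.

    Returns (decision, number_of_repeats); decision is None if no
    acceptance region was reached within max_repeats.
    """
    z = 0.0
    for r in range(1, max_repeats + 1):
        z += digit * s + noise()
        region = classify(z, k1, k2)
        if region == "accept +1":
            return 1, r
        if region == "accept -1":
            return -1, r
    return None, max_repeats

# Noise-free check: with s = 1 and k1 = 3 the sum first exceeds k1 on the 4th repeat.
print(receive_digit(+1, 1.0, 3.0, 12.0, noise=lambda: 0.0))
```

Passing a noise source such as `lambda: rng.gauss(0.0, 1.0)` (with `rng = random.Random()`) turns the same loop into a simple Monte Carlo estimate of the error rate and the average number of repeats.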


RCBDFS  Formulation 


The analysis of the CBDFS is considerably simplified by the assumption that the
impulsive noise component of the mixture distribution is of bounded variance.

This assumption permits us to invoke the central limit theorem, and as a
result, the relatively simple form of the Fokker-Planck equation remains valid.
However, the boundary conditions are modified by the introduction of an additional
set of thresholds in the received signal space.

Let $P_j(z, r)$, $j = 0, 1$, represent the joint probability, when $x_j$ is being sent, that
after $r$ transmissions the integrated signal is $z$, and that $z$ has remained ambiguous
(i.e., either in the null zone or in one of the two large signal ambiguity regions) on
all previous transmissions. In the limiting case, when the number of transmissions
per message digit is large, the central limit theorem is applicable, so that the
integrated signal has a Gaussian distribution in the absence of the ambiguity region
boundaries; this Gaussian distribution corresponds to the sum of two Gaussian
distributed r.v.'s, each corresponding to one of the components of the mixture noise.
$P_j(z, r)$ may be evaluated, for large $r$, by evaluating $P_j(z, t)$, where the integration time,
$t$, is given by $t = r/a$, where $a$ is the number of transmissions per unit time.


From the random walk problem $P_j(z, t)$ is given as a solution to

$$\frac{\partial P_j(z,t)}{\partial t} = B_j\,\frac{\partial P_j(z,t)}{\partial z} + D_j\,\frac{\partial^2 P_j(z,t)}{\partial z^2}, \qquad j = 0, 1 \qquad (3)$$

subject to the boundary conditions

$$P_j(k_1, t) = P_j(-k_1, t) = 0$$
$$P_j(k_2, t) = P_j(-k_2, t) = 0$$
$$P_j(z, 0) = \delta(z) \qquad (4)$$

where

$$B_j = -a s_j, \qquad D_j = a N/2$$

with $s_j$ the mean of the received signal and $N$ the variance of the mixture noise
distribution (i.e., sum of the two variances of the two Gaussian components).

Using the method of separation of variables, Eq. (3) is solved subject to the boundary
conditions Eq. (4).


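Following similar steps as in Ref. 1, the average probability of error and the ASN
for a symmetrical bipolar binary signal of level $s$, transmitted through a channel of
additive Gaussian and impulsive noise of total power $N$, are given by

$$P_e = \frac{1}{1 + \exp\!\left(\dfrac{2s}{N}\,\dfrac{k_1 k_2}{k_1 + k_2}\right)}$$

$$\mathrm{ASN} = \frac{k_1 k_2}{s(k_1 + k_2)}\;\frac{\exp\!\left(\dfrac{2s}{N}\,\dfrac{k_1 k_2}{k_1 + k_2}\right) - 1}{\exp\!\left(\dfrac{2s}{N}\,\dfrac{k_1 k_2}{k_1 + k_2}\right) + 1}$$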


Figure 2 shows the performance of the CBDFS, with symmetrical null zone
boundaries $-k_1$ and $k_1$, in the mixture noise environment in terms of ASN as a
function of signal-to-noise ratio, at a fixed error rate. For simplicity the signal
level is fixed at 1. Two impulsive noise intensities, $\epsilon = 5\%$ and $\epsilon = 10\%$, and two
error rates, $P_e = .01$ and $P_e = .0001$, are assumed.

Fig. 2. Performance of CBDFS.

In Fig. 3 the performance of the robustized CBDFS, with symmetrical impulsive
noise thresholds $-k_2$ and $k_2$, in the same mixture noise environment as the CBDFS
and under the same assumptions, is given. It is seen that a significant reduction in
the average sample size of the CBDFS is achieved as a result of protecting its
performance against impulsive noise. This improvement was expected because the
CBDFS is a parametric detector and is not robust with respect to variations in the
noise distribution, e.g., the occurrence of impulsive noise.

Fig. 3. Performance of robustized CBDFS.

SEQUENTIAL PARTITION DETECTORS WITH DEPENDENT SAMPLING
R. F. Dwyer and L. Kurz

In this report, the theory of sequential partition detectors with dependent sampling
is introduced. A new formulation is given which predicts the thresholds under q-dependent
sampling in order to maintain the same error probabilities as in the independent
sampling case. A comparison is made between independent and dependent sequential
partition detectors based on the average time to detection. Under stated conditions
dependent sequential partition detectors show improved efficiency for both Lehmann and
shift of the mean alternatives.


A.  Sequential  Partition  Detectors  (SPD) 

It was shown in Ref. 1 that the SPD test statistic was given by

$$T_n = \sum_{i=1}^{n} \sum_{k=1}^{m} b_k\, n_{ik} \qquad (1)$$


where  bj^  = called  scores  and  n.j^  is  a counting  device  for  keeping  track 

of  which  interval  the  observations  fall  in. 


Thus, Eq. (1) represents the classical cumulative sum form for the SPD with stopping
boundaries given by

$$b = \ln B < T_n < \ln A = a,$$

where the test is terminated if $T_n \ge a$ or $T_n \le b$ and another sample (or samples) is
accumulated if $T_n$ is between $a$ and $b$.
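
The cumulative-sum form with its two stopping boundaries can be sketched as below. This is an illustration under stated assumptions: the scores are passed in rather than derived from an alternative, the counting device $n_{ik}$ is realized by locating the cell with `bisect`, and the function name `spd` is hypothetical.

```python
from bisect import bisect_right

def spd(samples, cuts, scores, a, b):
    """Run a sequential partition detector.

    cuts   : the m - 1 finite quantiles a_1 < ... < a_{m-1}
    scores : the m scores b_1, ..., b_m, one per cell
    a, b   : upper and lower stopping boundaries (b < 0 < a)
    """
    t = 0.0
    for n, x in enumerate(samples, start=1):
        cell = bisect_right(cuts, x)   # index of the cell that contains x
        t += scores[cell]              # n_ik selects exactly one score per sample
        if t >= a:
            return "H1", n
        if t <= b:
            return "H0", n
    return "undecided", len(samples)

# m = 2 cells split at 0 (a sign-type detector) with scores -1 and +1.
print(spd([0.5, -0.2, 1.0, 2.0, 0.1], cuts=[0.0], scores=[-1.0, 1.0], a=3.0, b=-3.0))
```

Once `cuts` and `scores` are fixed, the loop never looks at the noise distribution itself, only at which cell each sample falls in.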

It must be emphasized that, once the quantiles are known (estimated), Eq. (1) is
independent of the underlying noise distribution under both hypotheses. By assuming a
Lehmann and shift of the mean alternative, $P_{1k}$ was defined in Ref. 1, respectively, as

$$P_{1k} = F^{1+\delta_1}(a_k) - F^{1+\delta_1}(a_{k-1}) \qquad \text{(Lehmann)}$$

$$P_{1k} = F(a_k - \Delta_1) - F(a_{k-1} - \Delta_1) \qquad \text{(shift of the mean)}$$

where $\delta_1$ and $\Delta_1$ represent the selected parameter based on a priori information (say,
minimum SNR of interest). It will be convenient in subsequent developments to use $\theta$,
$\theta_1$ as parameters when discussing general properties of the SPD pertaining to both
alternatives.


B.  Structure  of  Dependent  Sequential  Partition  Detectors 

Assume $x_1, \ldots, x_n$ is a sample from a stationary q-dependent random process
with c.d.f. $F(x_1, x_2, \ldots, x_n)$. Let $a = (a_0 = -\infty, \ldots, a_{m-1}, a_m = +\infty)$ be an ordered
vector of quantiles, estimated with independent samples and c.d.f. $F(x)$, subdividing
the real axis into $m$ cells. (See References 9 and 10.)


Define the random variable

$$T_{jN_0} = \sum_{i}^{N_0} \sum_{k}^{m} b_k\, n_{ik}, \qquad j = 1, 2, \ldots, n, \qquad (2)$$

where $N_0 > q$ represents the number of subsamples $(x_j, x_{j+1}, \ldots, x_{j+N_0})$ summed in a
decision interval.


Using the data sequence $(x_1, x_2, \ldots, x_n)$ the dependent sequential partition detector
(DSPD) is constructed to test two hypotheses $H_0$, $H_1$, with stopping boundaries $(\ln B,
\ln A)$. Then, for independent $T_{jN_0}$ ($j = 1, 2, \ldots, n$),

$$T_n = \sum_{j}^{n} T_{jN_0} \qquad (3)$$

and sampling continues if $\ln B < T_n < \ln A$ and $H_0$, $H_1$ is chosen if $T_n \le \ln B$, $T_n \ge \ln A$,
respectively.

The basic structure of the DSPD is shown in Fig. 1, where the quantiles $a$ are
estimated under $H_0$ with independent samples. Once the quantiles are known, the scores
$b_k$, $k = 1, 2, \ldots, m$, can be found if the functional form of the alternative is specified.

[Figure: block diagram of the detector; input labeled "q-Dependent Process".]

Fig. 1. Structure of DSPD.

Let the $N_0$-dimensional moment generating function of $T_{jN_0}$ be defined as

$$\phi_{N_0}(t) = E\left[\exp\left(\sum_{i}^{N_0} t_i T_i\right)\right] \qquad (4)$$

where

$$T_i = \sum_{k}^{m} b_k\, n_{ik}.$$


It  is  convenient  to  use  the  cumulant  expansion  for  (t)  (see  Reference  2). 


(O).  ^tjE(T,)t  1/2  V ■ 

O i i.L 


where 


and  0(/i)  represents  terms  greater  than  second  order. 

Since  T...  represents  a sum  of  T. , 

J^o 

^N  ^ ^ ^ ^ j f 2 ” ’ ’ ' ’ ~ ~ ^(f ) 


The  cumulant  expansion  reduces  to 


inC^(t)]=  tE(T  )+ 

n^oc 


Given  the  moment  generating  function,  (&{t)  = E(e^  ^ a value  of  t^=  ^ 


be  found  so  that  ^(t^)  = 1,  where 


toO)  = t^  = -2 
n^x 

Oj  0 


For  the  class  of  alternatives  considered,  0(/x)  0 as  the  selected  SNR  0^  ->  0 and 


The  mean  of  T.^  is  given  by 
o 


u xij  HI 

aV  ' = as  E = No  I ' n„eit,i  . 


o i k 


and  the  variance  of  T.^  by, 

o 
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T 

JN, 


ik' 
N N 


(6) 


'No  X^'>LE|,.j|(^nj)-N„'E2(T,l  , 


i *) 

where it was assumed that the process satisfies the strong mixing condition (Rosenblatt),
the joint expectation $E_{|i-j|}(n_{ik} n_{jL}) = E_{|i-j|}(n_{jL} n_{ik})$ is symmetric for all $i$, $j$, $k$ and $L$,
and the quantiles remain fixed during a decision period. Also, in Eq. (6) the dependence
on the sampling rate was expressed as subscripts $|i-j|$.

The limiting value of Eq. (6) is found by letting $m \to \infty$. The second term can be
asymptotically expressed as

V V 

lim  > ) b b E|-  . |(n,  n.) 

m-^oo  t t ^ ^ - Jl'  k j' 


^^®k-r  ^L-i^  ■ ^^^k’  ^L-i^  ■ ^^^k-r  1 ^ 

30  ^ 

f lx  ^ 8 |i.j  ^ |i.j  a (y)), 

where $\Lambda(\cdot) = \ln\,(g(\cdot)/f(\cdot))$ represents the asymptotic expression for the scores $b_k$,
$b_L$, and $G(a_k, a_L)$ is the joint c.d.f. under the alternative.

The limiting forms for the mean and variance of $T_i$ are given by

$$\lim_{m \to \infty} E(T_i) = E[\Lambda(x)]$$

and

$$\lim_{m \to \infty} \sigma_{T_i}^2 = \sigma_\Lambda^2.$$

By substituting the asymptotic forms into Eq. (6) the asymptotic variance of $T_{jN_0}$
reduces to

$$\sigma_{T_{jN_0}}^2 = N_0\,\sigma_\Lambda^2 \left[1 + \frac{2}{N_0} \sum_{q=1}^{N_0-1} (N_0 - q)\,\rho_\Lambda(q)\right],$$

where

$$\sum_{i}^{N_0} \sum_{j \ne i}^{N_0} \rho_\Lambda(|i-j|) = 2 \sum_{q=1}^{N_0-1} (N_0 - q)\,\rho_\Lambda(q)$$

and

$$\rho_\Lambda(|i-j|) = \frac{E_{|i-j|}(\Lambda(x)\,\Lambda(y)) - E^2(\Lambda(x))}{\sigma_\Lambda^2}$$

is defined as the asymptotic correlation coefficient.

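For finite $m$, the partition correlation coefficient is defined as

$$R_T(q) = \frac{\sum_{k} \sum_{L} b_k b_L\, E_q(n_k n_L) - E^2(T_i)}{\sigma_{T_i}^2} \qquad (7)$$

where

$$E_q(n_k n_L) = G_q(a_k, a_L) + G_q(a_{k-1}, a_{L-1}) - G_q(a_k, a_{L-1}) - G_q(a_{k-1}, a_L).$$

Then,

$$\sigma_{T_{jN_0}}^2 = \sigma_{T_i}^2\, N_0 \left(1 + \frac{2}{N_0} \sum_{q=1}^{N_0-1} (N_0 - q)\,R_T(q)\right).$$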


For a Gaussian distribution with input correlation coefficient $\psi(q)$, Eq. (7) can
be simplified further by expanding $G(a_k, a_L)$ and integrating each term over $\psi(q)$
(see Ref. 2, p. 290). The result is given by

^qK  V = ^k  + «<Vl-  ^L-l>  - 

^L-1^  ■ ®^^k-i’  (q>  • 


(8) 


•»■  *!*•  *«* 
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where 


g(a  , a.  ) = — i-j-  ^ ^ 

^ ^ 2ira^  /TT^^ 


CT  [1  -q;2  ] 


^ ^ shift  of  the  mean  alternative. 


Example 1

Let $m = 2$, $\Delta = 0$, $b_1 = -b_2$, $\sigma^2 = 1$, and the optimum quantiles are given by

$$a_0 = -\infty, \qquad a_1 = 0, \qquad a_2 = +\infty.$$

Then, $p_1 = p_2 = 1/2$,

$$E_q(n_1 n_1) = E_q(n_2 n_2) = \frac{1}{4} + \frac{1}{2\pi} \int_0^{\psi(q)} \frac{d\psi}{\sqrt{1 - \psi^2}} = \frac{1}{4} + \frac{1}{2\pi} \arcsin \psi(q),$$

$$E_q(n_1 n_2) = E_q(n_2 n_1) = \frac{1}{4} - \frac{1}{2\pi} \arcsin \psi(q)$$

and the partition correlation coefficient reduces to

$$R_T(q) = \frac{2}{\pi} \arcsin \psi(q).$$
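
The arithmetic of Example 1 can be retraced numerically. The sketch below is only a check under the assumptions of the example (two cells split at zero, $b_1 = -b_2$, zero-mean unit-variance Gaussian input); the function names are hypothetical.

```python
from math import asin, pi

def sign_partition_correlation(psi):
    """R_T for m = 2 and Delta = 0: (2 / pi) * arcsin(psi)."""
    return 2.0 / pi * asin(psi)

def from_cell_probabilities(psi):
    """The same quantity rebuilt from the four joint cell probabilities."""
    same = 0.25 + asin(psi) / (2.0 * pi)   # E_q(n1 n1) = E_q(n2 n2)
    diff = 0.25 - asin(psi) / (2.0 * pi)   # E_q(n1 n2) = E_q(n2 n1)
    b1, b2 = 1.0, -1.0                     # any b1 = -b2 gives the same ratio
    second_moment = b1 * b1 * same + b2 * b2 * same + 2.0 * b1 * b2 * diff
    variance = 0.5 * b1 * b1 + 0.5 * b2 * b2   # E(T_i) = 0 in this example
    return second_moment / variance

for psi in (0.0, 0.3, 0.7, 1.0):
    print(psi, sign_partition_correlation(psi), from_cell_probabilities(psi))
```

For every $\psi$ between 0 and 1 the printed correlation does not exceed $\psi$ itself, which is the $m = 2$ instance of the bound $\psi(q) \ge R_T(q)$ discussed next.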

Equation (8) was evaluated using numerical analysis techniques for $m = 2$, 4 and 6;
$\Delta = 0$, 1 and $0 \le \psi(q) \le 1$.

The results are given in Figures 2 and 3. Note that $\psi(q) \ge R_T(q)$ for all $m$,
and as $m \to \infty$, $R_T(q) \to \psi(q)$ very rapidly. Therefore, from these results for a
Gaussian process, it can be concluded that $\rho_\Lambda(q)$ gives an upper bound for $R_T(q)$.

Unfortunately, for other distributions the evaluation of Eq. (7) is difficult, since
a two-dimensional integration is required. However, using a series expansion and
assuming small $\psi(q)$ for the Rayleigh process under a Lehmann alternative, it was
shown that $\rho_\Lambda(q) \ge R_T(q)$ for $m = 2$ and 4.

From the above results, and since $\lim_{m \to \infty} R_T(q) \to \rho_\Lambda(q)$, it will be assumed
(without proof) in subsequent derivations that $\rho_\Lambda(q) \ge R_T(q)$. The subsequent
results hold for distributions for which this is true.

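Let $t(\theta)$ represent the value of $t = t_0 \ne 0$ for which $\phi(t_0) = 1$ under independent
sampling; then, for q-dependent sampling, the value of $t = t_0 = t_D \ne 0$ as $m \to \infty$ is
given by

$$t_0 = t_D(\theta) = \frac{t(\theta)}{1 + \dfrac{2}{N_0} \displaystyle\sum_{q=1}^{N_0-1} (N_0 - q)\,\rho_\Lambda(q)}$$

where

$$t(\theta) = -2\,\frac{E(T_i)}{\sigma_{T_i}^2} \qquad (\theta_1 \to 0)$$

is given in Reference 1.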

Notice the effects of dependence (rapid sampling) only show up in the solution
of $\phi(t) = 1$. From the fundamental identity the probability of detection becomes

$$P(\theta) = \frac{1 - e^{t_D(\theta)\, b}}{e^{t_D(\theta)\, a} - e^{t_D(\theta)\, b}}.$$

Also, from Wald, "neglecting the excess over the boundaries," the thresholds are
given by

$$a = \ln \frac{1 - \beta}{\alpha} \qquad \text{and} \qquad b = \ln \frac{\beta}{1 - \alpha}.$$


Therefore, in order to guard against higher $\alpha$ and $\beta$ due to dependence, the
thresholds $a$ and $b$ must be adjusted by precisely

$$\left[1 + \frac{2}{N_0} \sum_{q=1}^{N_0-1} (N_0 - q)\,\rho_\Lambda(q)\right].$$

Define

$$a' = \left[1 + \frac{2}{N_0} \sum_{q=1}^{N_0-1} (N_0 - q)\,\rho_\Lambda(q)\right] a$$

$$b' = \left[1 + \frac{2}{N_0} \sum_{q=1}^{N_0-1} (N_0 - q)\,\rho_\Lambda(q)\right] b$$

as the new thresholds needed for a q-dependent process to assure the same protection
against $\alpha$ and $\beta$ as in the independent case.
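
The effect of the adjustment can be illustrated with a short sketch. It assumes Wald's approximate thresholds and the detection-probability expression quoted above, and takes the dependence factor as a given number; the function names and the values of `t` and `factor` are hypothetical.

```python
from math import exp, log

def wald_thresholds(alpha, beta):
    """Wald's approximations a = ln((1 - beta)/alpha), b = ln(beta/(1 - alpha))."""
    return log((1.0 - beta) / alpha), log(beta / (1.0 - alpha))

def detection_probability(t, a, b):
    """P = (1 - e^{t b}) / (e^{t a} - e^{t b}) for a nonzero root t."""
    return (1.0 - exp(t * b)) / (exp(t * a) - exp(t * b))

a, b = wald_thresholds(alpha=0.01, beta=0.01)
t = -1.0          # root for independent sampling (illustrative value)
factor = 1.6      # dependence factor 1 + (2/N0) * sum (N0 - q) rho(q), assumed given

p_independent = detection_probability(t, a, b)
p_unadjusted = detection_probability(t / factor, a, b)            # dependence ignored
p_adjusted = detection_probability(t / factor, factor * a, factor * b)
print(p_independent, p_unadjusted, p_adjusted)
```

Because dependence divides the root by the factor while the adjusted thresholds are multiplied by it, the products $t_D a'$ and $t_D b'$ are unchanged, so the adjusted detector recovers the independent-sampling detection probability, whereas the unadjusted one falls short of it.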

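The average sample number for a q-dependent process (ASND) with adjusted
thresholds $a'$, $b'$ under a Lehmann or shift of the mean alternative becomes then,

$$\mathrm{ASND} = \frac{\left[1 + \dfrac{2}{N_0} \displaystyle\sum_{q=1}^{N_0-1} (N_0 - q)\,\rho_\Lambda(q)\right] \left[b\,(e^{t(\theta) a} - 1) + a\,(1 - e^{t(\theta) b})\right]}{N_0\, E(T_i) \left(e^{t(\theta) a} - e^{t(\theta) b}\right)}, \qquad E(T_i) \ne 0 \qquad (11)$$

and

$$\mathrm{ASND} = -\frac{\left[1 + \dfrac{2}{N_0} \displaystyle\sum_{q=1}^{N_0-1} (N_0 - q)\,\rho_\Lambda(q)\right] a\, b}{N_0\, \sigma_{T_i}^2}, \qquad E(T_i) = 0. \qquad (12)$$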


Define $t_1/t_D$ as the ratio of time required for independent observations to the
total time needed for a decision interval, including enough delay to assure independent
samples $[T_{jN_0}, T_{(j+1)N_0}]$.

It follows that

$$\frac{t_1}{t_D} = \frac{r_s}{N_0 + r_s - 1} \qquad (13)$$

where $r_s/\Delta'$ is the sampling rate, $\Delta'$ is the time between independent observations and
$r_s$ is an integer. For $r_s = 1$, $1/\Delta'$ is the sampling rate for independent observations.
If the sampling rate is increased $r_s$ times, $r_s - 1$ samples are skipped after each
decision interval, $T_{jN_0}$.

Table I gives typical threshold adjustment factors for a Gaussian process based
on $\rho_\Lambda(q)$ as a function of $r_s$ and $N_0$.

TABLE I. Threshold adjustment factors for a Gaussian process.

              r_s (sampling rate)
   N_0        2        3        4
   2          1.2               1.5
   3                   1.6
   4          1.4               2.0
   6                   1.9
   8          1.5               2.4
   ∞          1.6      2.2      2.8

Note

Since $\rho_\Lambda(q) \ge R_T(q)$, the adjusted thresholds $a'$, $b'$ give an upper bound on the
errors $\alpha$ and $\beta$.

It should also be pointed out that in order to calculate the new thresholds $a'$, $b'$ in
practical implementation, the form and value of $\rho_\Lambda(q)$ must be known. This limitation
could be removed by estimating $\rho_\Lambda(q)$ at, say, a few sampling rates and setting up
thresholds $a'$, $b'$ which will bound the errors,

$$\alpha' \le \alpha$$
$$\beta' \le \beta$$

as long as the predicted dependence is not exceeded.

There does not seem to be any general technique which will eliminate the need
to know $\rho_\Lambda(q)$ in all cases.
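
When $\rho_\Lambda(q)$ is known, the adjustment factor is a one-line sum. The sketch below assumes the Gauss-Markov correlation of Example 2 later in this report with $k = 3$, so that the correlation at lag $q$ under $r_s$-fold oversampling is $e^{-3q/r_s}$; that choice of model for Table I is an assumption of this sketch, and the function names are hypothetical.

```python
from math import exp

def gauss_markov(q, r_s, k=3.0):
    """Correlation at lag q when sampling r_s times faster: exp(-k q / r_s)."""
    return exp(-k * q / r_s)

def adjustment_factor(n0, rho):
    """1 + (2 / N0) * sum_{q=1}^{N0-1} (N0 - q) * rho(q)."""
    return 1.0 + 2.0 / n0 * sum((n0 - q) * rho(q) for q in range(1, n0))

def limiting_factor(r_s, k=3.0):
    """N0 -> infinity limit, 1 + 2 * sum_{q>=1} rho(q), summed in closed form."""
    x = exp(-k / r_s)
    return 1.0 + 2.0 * x / (1.0 - x)

for n0, r_s in [(2, 2), (4, 2), (8, 2), (3, 3), (6, 3), (2, 4), (4, 4), (8, 4)]:
    f = adjustment_factor(n0, lambda q: gauss_markov(q, r_s))
    print(n0, r_s, round(f, 1))
for r_s in (2, 3, 4):
    print("inf", r_s, round(limiting_factor(r_s), 1))
```

Under this assumed correlation model the rounded values appear to agree with the entries of Table I, which suggests the table was computed for the same process; with an estimated $\rho_\Lambda(q)$ the same function gives the factor to apply to $a$ and $b$.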

C.  Efficiency  Calculations  for  Dependent  Sequential  Partition  Detectors 

Let  N (block  size)  represent  the  number  of  samples  summed  in  a decision  in- 
terval . If  the  sampling  rate  is  such  that  r^  = N^,  there  can  be  no  increase  in  ASND 
solely  from  accumulating  the  samples  in  blocks.  The  reason  for  this  is  that  all  de- 
pendent samples  fall  into  the  independent  block  size  Nj,  neglecting  for  the  moment 
the  additional  delay  needed  between  dependent  blocks.  However,  if  N^  > r^  N^  , then 
grouping  the  samples  in  blocks  can  only  increase  ASND,  except. in  a special  case. 
Suppose  n^  is  the  number  of  groups  needed  to  terminate  a test  when  the  samples  are 
independent  and  blocks  are  of  size  N^.  If  n^  is  an  integral  multiple  of  N^,  then 
grouping  in  blocks  of  N when  samples  are  dependent  will  not  in  itself  increase  ASND. 
From  the  discussion  in  the  last  section,  dependence  effects  could  be  removed  from 
P(e)  by  increasing  the  thresholds  to  a'  and  b'.  In  the  subsequent  discussion,  assume 
this  has  been  done. 

In general, the amount of protection afforded by sampling in blocks depends
upon $N_0$, $m$, and $\theta$. Since $\theta_1$ can be made as small as desired, and usually $N_0$ is
bounded, $P(\theta)$ will not substantially change ($\theta_1$ small) when sampling in blocks.
Therefore, the probability of detection $P(\theta)$ when making efficiency calculations will be
assumed constant. Also, for sufficiently small $\theta_1$ ($n \to \infty$), the effect of grouping on the
ASND is negligible.

In order to compare the relative efficiencies of two tests using dependent sampling,
the ASND will be based on the time it takes test $T_1$ to terminate compared
with test $T_2$.

Let $t_1$ and $t_D$ represent the time between data samples for SPD and DSPD,
respectively. Then, the efficiency of DSPD with respect to SPD is defined by

$$\text{Efficiency} = \frac{\mathrm{ASN} \cdot t_1}{\mathrm{ASND} \cdot t_D} = \frac{\mathrm{ASN}}{\mathrm{ASND}}\,\frac{r_s}{N_0 + r_s - 1}$$

where Eq. (13) was employed using an independent sampling rate of $1/\Delta'$.

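1. Lehmann Alternative Case

Let $N_0 = r_s$ and define, for fixed $\alpha$, $\beta$,

$$\mathcal{E}\left[\frac{T_{L\infty}}{T_{LDm}}\right] = \frac{\mathrm{ASNL}_\infty}{\mathrm{ASNDL}}\,\frac{r_s}{N_0 + r_s - 1} = \frac{\displaystyle\sum_{k}^{m} \frac{\left[F(a_k) \ln F(a_k) - F(a_{k-1}) \ln F(a_{k-1})\right]^2}{F(a_k) - F(a_{k-1})}}{1 + \dfrac{2}{N_0} \displaystyle\sum_{q}^{N_0-1} (N_0 - q)\,\rho_\Lambda(q)}\;\frac{N_0\, r_s}{N_0 + r_s - 1}$$

where $T_{L\infty}$ is defined as the optimum test, $m \to \infty$, and

$$\lim_{m \to \infty} \sum_{k}^{m} \frac{\left[F(a_k) \ln F(a_k) - F(a_{k-1}) \ln F(a_{k-1})\right]^2}{F(a_k) - F(a_{k-1})} = 1.$$

Then,

$$\lim_{m \to \infty} \mathcal{E}\left[\frac{T_{L\infty}}{T_{LDm}}\right] = \frac{1}{1 + \dfrac{2}{N_0} \displaystyle\sum_{q}^{N_0-1} (N_0 - q)\,\rho_\Lambda(q)}\;\frac{N_0\, r_s}{N_0 + r_s - 1}.$$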


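Now, let $r_s \to \infty$, $N_0 \to \infty$:

$$\lim_{\substack{m \to \infty \\ r_s \to \infty \\ N_0 \to \infty}} \mathcal{E}\left[\frac{T_{L\infty}}{T_{LDm}}\right] = \frac{\Delta'}{2 \displaystyle\int_0^\infty \rho_\Lambda(\mu)\, d\mu} \qquad (14)$$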


Kassam and Thomas obtained the same result for a fixed sample sign test in
Gaussian noise. However, the above result also applies to dependent sequential
partition detectors and is independent of the distribution under the Lehmann
alternative, and for $m > 2$ represents a generalization of the sign ($m = 2$) test.

If $N_0 > r_s$ and $r_s$ is fixed, the above results apply as long as $N_0 < \infty$, and if it can
be assumed that as $n \to \infty$, $\theta \to 0$, the effect of grouping is negligible.

Example  2 

g 

Let  the  noise  distribution  be  a Gauss-Markov  process.  Then, 

-k/A'  1*^1 

P/y(^^)  = e 

and  for  independent  samples  p = A',  and  if  k = 3,  the  correlation  between  two  adjacent 

samples  is  .0  6.  Then,  from  Eq.  (14) 

T,  -> 

Lao 


& j 1 —1.5 

m— 30  LiDm 

r — » 
s 

No-^ao 

which is a significant improvement in efficiency for the DSPD.

Figure 4 compares efficiencies for $m = 2$ and $m = 8$ when $r_s = 2$ and $N_o$ is increased. The curve for $m = 8$ depicts the improvement in efficiency over the sign ($m = 2$) test under this condition. The improvement in efficiency for $m = 8$ over the sign ($m = 2$) test as $N_o \to \infty$, $r_s \to \infty$ approaches 2.2.

Fig. 4. Efficiency of DSPD vs SPD ($m = 2$).

2. Shift of the Mean Alternative Case


Let 

1 ASNS» 

(5  1=  

T ^ I ASNDS 
1 sDm  J 


r 

s 


N + r - 1 
o s 


be the efficiency for the DSPD compared to the optimum ($m \to \infty$) test under a shift of the mean alternative when $\alpha$ and $\beta$ are fixed.

Then, 


6 


S» 


sDm 


m [f(a^_j)-f(a^)]' 


F(a^)-F(a^,j) 


N 


[ »+  IT  ^ \N^-q)p^(q)]  / f(x)  dx 


N r 
o s 


N + r - 1 
o s 


o q 


and  as  m — XI 


r T 


m-*-» 


SOO 


sDm 


N r 
o s 


N 

- _o-l  N + r - 1 

2_  V'  /•VT  /_V1  O S 


o q 


Also  assuming  a Gauss-Markov  process  as  in  Example  2,  the  efficiency  approaches 


m—x) 

r -►X) 
s 

N —00 
o 


SOO 


sDm 


— 1.5 


as  in  the  Lehmann  case. 

Note that $F(x)$ must be specified in the shift case except when $m \to \infty$. In general, the efficiencies for two non-optimum ($m < \infty$) tests can be compared.


sm, 


sDm^  J 


”2 


N r ■ 
o s 

N r - 1 
o s 


N _j  m 

[1  + (N^-q)PA(q)]  ^ 


“l  [f(aj^_l)-f(a^)]^ 


F(a^)-F(ak.i)

D.  Conclusions

The effects of dependent sampling on sequential partition detectors (SPD) were considered in this report. First, the theory of SPD was extended to include q-dimensional processes by introducing a linear transformation from a q-dimensional space to a one-dimensional space. Then, using Wald's fundamental identity, the probability of detection and average sample number for the DSPD were derived. It was shown how to adjust the thresholds (a, b) in order to obtain the same error probabilities as in the independent sampling case. Next, for the adjusted thresholds (a', b'), efficiency calculations were made under Lehmann and shift of the mean alternatives. Both alternatives gave improved efficiencies under dependent sampling.

SEQUENTIAL PARTITION DETECTORS IN LARGE SIGNAL AND IMPULSIVE NOISE ENVIRONMENTS

R.F. Dwyer and L. Kurz

In the previous report, the concept of partition detection was extended to include sequential detection and certain types of dependence (Refs. 1 and 2). In this report, the theory of sequential partition detectors is extended to include large signals and impulsive noise. The notation and definitions of Refs. 1 and 2 are used in this report.

The cases of large signals in noise and small signals in impulsive noise will be treated separately. In most practical situations, these are the cases of interest. However, there can be some degree of overlap, i.e., moderate signal strengths in impulsive noise, without having to redesign the sequential detector.

The performance, on the other hand, is difficult to predict since it depends upon the severity of the impulsive noise, its frequency, and its time of occurrence. For example, if an impulse with sufficient strength to terminate the test occurs when the statistic $T_n$ is near one of its boundaries (a, b), large errors can be expected. However, if $T_n$ is between its boundaries when an impulse occurs and the test does not terminate immediately, it is not unreasonable to assume that, given a sufficient number of samples without an impulse occurring, the test could recover and make the correct decision within its designed error probabilities.

From this example, it is evident that sequential tests are more sensitive to impulsive noise (i.e., more likely to make the wrong decision) than fixed sample tests since, with everything else equal, a fixed sample test will usually have time to "average out" the impulsive interference.

Also notice that the partition structure itself is somewhat insensitive to impulsive noise. That is, for a particular partition, say $m = 2$, all interfering noise levels are given the same value, not unlike a hard clipper, and, depending upon the design, each impulse contributes only a small amount to the total (Ref. 3).

In the next section, sequential partition detectors are designed to operate in large signal environments where the noise is undefined except that the quantiles will be assumed known. The basic equations for sequential analysis are still valid for this case; however, the approximations made in finding a solution to the moment generating function in the preceding work (Refs. 1 and 2) are not. Also, neglecting the excess over the boundaries would not be appropriate in this case, since both $|E(t_i)|$ and $\sigma_{t_i}$ at $(\theta_0, \theta_1)$ are expected to be large and, therefore, the ASN small.

In the impulsive noise case, a reduced partition space will be introduced which discards (i.e., assigns weight zero to) samples falling outside the space. This is effectively a partition correlator which only accepts samples falling within its defined range. Assuming that impulsive noise is composed of high amplitude, large variance impulses occurring randomly, and that the signals of interest are of low energy in noise of small variance occurring during a decision interval, the space can then be partitioned into two disjoint regions. A signal space and an impulsive noise space will be defined in $R_1$. An impulse falling into the signal space or a signal sample falling into the impulsive space cannot be discerned further. The main concern, however, will be to reduce the sensitivity to impulsive noise using the same simple structure of the SPD under the Lehmann and shift alternatives. The price paid for this insensitivity will generally be an increased ASN. A similar approach was introduced by Rappaport and Kurz (Ref. 4) for digital signalling in non-Gaussian noise problems.

A.  A Sequential  Partition  Detector  for  Large  SNR  Cases 

For  some  situations,  the  small  SNR  model  of  the  SPD  developed  in  the  previous 
work  may  not  be  applicable.  In  active  sonar,  for  example,  detection  of  a target  is 
desired  in  only  a few  "pings"  (samples)  when  the  energy  from  the  echo  return  is  large. 
Therefore,  under  these  circumstances,  another  approach  is  taken. 

Assume $x_1, x_2, \ldots, x_n$ is a random sample of i.i.d. r.v. with c.d.f. $F(x)$ and $a$ is a vector of constants subdividing the real axis into $m$ cells as discussed previously. Also, let

$$T_n = \sum_{i=1}^{n} t_i = \sum_{i=1}^{n} \sum_{k=1}^{m} b_k n_{ik}$$

be the nonparametric test statistic, where $n_{ik}$ and $b_k$ are as defined previously. Recall that the fundamental identity is given by

$$E\left(e^{T_n t}\, \varphi(t)^{-n}\right) = 1 \qquad (1)$$

where

$$\varphi(t) = \sum_{k=1}^{m} P_k\, e^{b_k t} = 1 . \qquad (2)$$

The following is a modified version of Wald's derivation. Assume the scores $b_k$ only take on integral multiple values of some constant $d$. Since the scores are functions of the likelihood ratio, this assumption somewhat restricts the quantile locations and also the selected alternative parameter $\theta_1$. Therefore, for the Lehmann and shift alternatives, the efficiency will be reduced from its optimum value. Fortunately, this is not serious, since some loss in efficiency can be tolerated in the high SNR case. However, for a shift alternative and $m = 2$ under Gaussian noise, no restrictions are imposed by assuming the scores are multiple values of $d$. In general, finding the common factor $d$ for $m > 2$ is difficult, and the technique is therefore of limited use. However, for those cases where $d$ can be found, it does provide a means of obtaining $P(\theta)$ and $E(n/\theta)$ without approximation. Later, an example will be given, for $m = 2$ under a shift alternative, comparing the results using the approximation for $E(n/\theta)$ and the exact technique for $\theta$ large.

Again, the roots of Eq. (2) must be found. Let $u = e^{t}$; then

$$\sum_{k=1}^{m} P_k\, u^{b_k} = 1 . \qquad (3)$$

There are $m'$ roots to Eq. (3), $u_i(\theta)$ $(i = 1, 2, \ldots, m')$, and assuming that they are all unique, Eq. (1) reduces to

$$E\left(u_i(\theta)^{T_n}\right) = 1, \qquad (i = 1, 2, \ldots, m') .$$

At termination, $T_n$ can take on only a finite set of values $(c_1, c_2, \ldots, c_{m'})$ which are determined from the modified thresholds $(da, db)$ and the values of $b_k$. If, for example, $m = 2$, $da = 5$, $db = -5$, and $b_1 = -1$, $b_2 = +1$, then the test can only terminate at $\pm 5$.
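Therefore,

$$\sum_{k=1}^{m'} P(T_n = c_k/\theta)\, u_i(\theta)^{c_k} = 1, \qquad (i = 1, 2, \ldots, m')$$

or in the determinant form

$$P(T_n = c_k/\theta) = \frac{\begin{vmatrix} u_1(\theta)^{c_1} & \cdots & u_1(\theta)^{c_{k-1}} & 1 & u_1(\theta)^{c_{k+1}} & \cdots & u_1(\theta)^{c_{m'}} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ u_{m'}(\theta)^{c_1} & \cdots & u_{m'}(\theta)^{c_{k-1}} & 1 & u_{m'}(\theta)^{c_{k+1}} & \cdots & u_{m'}(\theta)^{c_{m'}} \end{vmatrix}}{\begin{vmatrix} u_1(\theta)^{c_1} & \cdots & u_1(\theta)^{c_{m'}} \\ \vdots & & \vdots \\ u_{m'}(\theta)^{c_1} & \cdots & u_{m'}(\theta)^{c_{m'}} \end{vmatrix}} .$$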

Then the probability of detection is the sum over all $k$ for which $c_k \geq ad$,

$$P(\theta) = \sum_{k} P(T_n = c_k/\theta), \qquad (4)$$

and the ASN is

$$E(n/\theta) = \frac{1}{E(t_i)} \sum_{k} c_k\, P(T_n = c_k/\theta), \qquad (5)$$

where both Eq. (4) and Eq. (5) are exact in that the excess over the boundary is not neglected.

These equations are general but, unfortunately, very cumbersome for comparison purposes. Some simplification in the evaluation of these equations was given by Girshick, but they are still formidable for practical interpretation. A practical technique for evaluating the ASN and the probability distribution $P(n)$ was introduced by Proakis. The SPD is represented, in this technique, as a Markov process with absorbing boundaries $a$, $b$. The states of the Markov process correspond to possible values of $T_n$.

First, an example will be given, using Wald's procedure, for the case when $m = 2$ under a shift alternative. Next, $P(n)$ and ASN for the SPD will be calculated for $m = 2$, 4, 6 using Proakis' technique under a shift alternative. Then the results of a simulation in Gaussian noise, $N(0, 1)$, under a shift alternative will be compared, for $m = 2$, 4, and 6, to Proakis' technique.

Example 1. Bernoulli Trials

Let $m = 2$ and assume $b_1 = -1$ and $b_2 = +1$. Then,

$$\sum_{k=1}^{2} P_k\, e^{b_k t} = P_1 e^{-t} + P_2 e^{t} = 1 . \qquad (6)$$

Since $P_2 = 1 - P_1$, the roots of Eq. (6) are

$$u_1 = \frac{P_1}{1 - P_1}, \qquad u_2 = 1 .$$

At this point, the value for $t$ for which $\varphi(t) = 1$ can be found:

$$t = -\ln\left(\frac{1 - P_1}{P_1}\right) .$$

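Then, using Eq. (4) and Eq. (5) and noting that $P_1 = P_1(\theta)$, the probability of detection and the ASN can be found from

$$P(\theta) = \frac{1 - \left[\dfrac{P_1(\theta)}{1 - P_1(\theta)}\right]^{db}}{\left[\dfrac{P_1(\theta)}{1 - P_1(\theta)}\right]^{da} - \left[\dfrac{P_1(\theta)}{1 - P_1(\theta)}\right]^{db}} = \frac{1 - e^{db\,t(\theta)}}{e^{da\,t(\theta)} - e^{db\,t(\theta)}}$$

and

$$E(n/\theta) = \frac{da\left[1 - e^{db\,t(\theta)}\right] + db\left[e^{da\,t(\theta)} - 1\right]}{E(t_i)\left[e^{da\,t(\theta)} - e^{db\,t(\theta)}\right]}, \qquad E(t_i) \neq 0 \qquad (7)$$

$$E(n/\theta) = \frac{-ab\,d^2}{E(t_i^2)}, \qquad E(t_i) = 0 \qquad (8)$$

where

$$E(t_i) = \sum_{k} b_k P_k = 1 - 2P_1(\theta)$$

and

$$E(t_i^2) = \sum_{k} b_k^2 P_k = 1 .$$

Note that the constant $d$ appears only in the thresholds.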
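The two-cell case above can be evaluated numerically. The following is a minimal sketch, assuming integer thresholds `upper` and `lower` that stand in for $da$ and $db$, and a hypothetical function name `exact_bernoulli_spd`; it solves the two termination equations for the roots $u_1 = P_1/(1 - P_1)$ and $u_2 = 1$ directly rather than reproducing any program from the report.

```python
import math


def exact_bernoulli_spd(p1, upper, lower):
    """Exact detection probability and ASN for a +/-1 walk with two thresholds.

    p1    : probability of the cell scored -1 (the cell scored +1 has 1 - p1)
    upper : positive integer threshold (role of da); reaching it means "detect"
    lower : negative integer threshold (role of db)
    """
    if not 0.0 < p1 < 1.0:
        raise ValueError("p1 must lie strictly between 0 and 1")
    if not (isinstance(upper, int) and isinstance(lower, int) and lower < 0 < upper):
        raise ValueError("need integer thresholds with lower < 0 < upper")

    drift = 1.0 - 2.0 * p1  # E(t_i) for scores -1 and +1
    if math.isclose(drift, 0.0, abs_tol=1e-12):
        # Zero-drift limit: both formulas are taken in the limit u -> 1.
        p_detect = -lower / (upper - lower)
        asn = -upper * lower  # E(T_n^2) / E(t_i^2), with E(t_i^2) = 1
        return p_detect, asn

    u = p1 / (1.0 - p1)  # nontrivial root of p1 / u + (1 - p1) * u = 1
    p_detect = (1.0 - u ** lower) / (u ** upper - u ** lower)
    asn = (upper * p_detect + lower * (1.0 - p_detect)) / drift
    return p_detect, asn


if __name__ == "__main__":
    p, n = exact_bernoulli_spd(0.3, 5, -5)
    print(f"P(detect) = {p:.6f}, ASN = {n:.4f}")
```

Because the walk moves in unit steps, it lands exactly on a threshold, which is why no excess-over-boundary correction is needed in this special case.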

Figure 1 compares the ASN and the power function $P(\Delta)$, using Wald's method, with the SPD estimate for a selected alternative of $\theta_1 = 1$ and $m = 2$. The SPD estimates of ASN and $P(\Delta)$ agree very well with the exact values calculated by Wald's method over the range of $\Delta$ plotted. Figures 2, 3 and 4 show the exact probability distribution $P(n)$ for $\Delta = .5$, $m = 2$, 4 and 6; $\Delta = 1$, $m = 2$, 4 and 6; and $\Delta = 2$, $m = 2$, 4 and 6, respectively. The ASN was obtained from the figures by calculating the probability of terminating at each $n$. Table I compares the ASN obtained from Proakis' exact method and the ASN measured from a simulation of the SPD model in Gaussian noise, $N(0, 1)$, with the SPD estimate.

TABLE I.  ASN comparison of exact and simulation results with the SPD estimate
          for Gaussian noise, N(0, 1), under a shift alternative and α = β < .0067.

              Δ = .5                       Δ = 1                        Δ = 2
  m    Exact   Sim.   SPD Est.      Exact   Sim.   SPD Est.      Exact   Sim.   SPD Est.
  2    64.3    59     62.5          18.1    17     15.3          5.7     5.7    3.9
  4    45.7    43.3   44.7          12      11.4   11.2          3.3     3.2    2.8
  6    40.9    35     41.9          10.8    10.5   10.5          3       2.8    2.6

  Exact    - Proakis' technique
  Sim.     - Computer simulation using SPD model
  SPD Est. - Values of ASN predicted from the SPD equations

It is concluded from the results in the figures that the approximation technique for the SPD predicts the performance parameters exceedingly well, even for moderately high SNR.
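The Markov-process view of the detector can be illustrated for the simplest case of two cells with scores $-1$ and $+1$. The sketch below is an assumption-laden illustration, not Proakis' program: the function name `markov_spd`, the unit-step walk, and the stopping tolerance are all hypothetical choices. It propagates the probability mass over the transient values of $T_n$ and records the mass absorbed at each step, which gives $P(n)$ and hence the ASN.

```python
def markov_spd(p1, upper, lower, max_steps=10_000):
    """Termination-time distribution of a +/-1 walk with absorbing thresholds.

    p1 is the probability of a -1 step; upper > 0 > lower are integer thresholds.
    Returns (p_n, p_detect, asn), where p_n[n] is the probability that the
    test stops exactly at sample n.
    """
    states = list(range(lower + 1, upper))      # transient values of T_n
    mass = {s: 0.0 for s in states}
    mass[0] = 1.0                               # the test starts at T_0 = 0
    p_n = [0.0]                                 # the test cannot stop at n = 0
    p_detect = 0.0

    for _ in range(max_steps):
        nxt = {s: 0.0 for s in states}
        stopped = 0.0
        for s, w in mass.items():
            if w == 0.0:
                continue
            for step, prob in ((-1, p1), (+1, 1.0 - p1)):
                t = s + step
                if t >= upper:                  # absorbed at the upper boundary
                    stopped += w * prob
                    p_detect += w * prob
                elif t <= lower:                # absorbed at the lower boundary
                    stopped += w * prob
                else:
                    nxt[t] += w * prob
        p_n.append(stopped)
        mass = nxt
        if sum(mass.values()) < 1e-15:          # remaining mass is negligible
            break

    asn = sum(n * p for n, p in enumerate(p_n))
    return p_n, p_detect, asn


if __name__ == "__main__":
    dist, pd, asn = markov_spd(0.3, 5, -5)
    print(f"P(detect) = {pd:.6f}, ASN = {asn:.4f}, steps tracked = {len(dist) - 1}")
```

For more than two cells the same bookkeeping applies with one transition per cell score, provided the scores are integer multiples of a common constant so that the state space stays finite.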

B.  A Reduced  Partition  Space  to  Combat  Impulsive  Noise  Interference 

As before, assume $x_1, x_2, \ldots, x_n$ is a sample of i.i.d. r.v. with c.d.f. $F(x)$ and $a$ is a vector of constants, estimated under nonimpulsive noise conditions, subdividing the real axis into $m$ cells. However, now the space does not extend to $\pm\infty$ but is reduced, i.e., $F(a_L) = c_L$ and $F(a_U) = c_U$. In general, the quantiles can be estimated using robustized estimation techniques which do not require nonimpulsive noise conditions for estimation (Ref. 8). In the one-sided Lehmann case, $c_L = 0$ and the space is only restricted to the right. The two-sided shift case requires a reduced partition at both ends of the distribution. Samples falling beyond $c_L$ and $c_U$ are discarded and not considered in the sequential test. In this way, the SPD becomes insensitive to high amplitude impulse noise interference. Since the amplitude and frequency of occurrence of impulsive noise are assumed random, the reduced partition structure represents the best compromise between suppressing the contribution from interference and maintaining the simple structure of the SPD. However, the amplitudes of the impulsive noise are distributed throughout the space and some portion of the samples will fall within the range of the reduced partition. Therefore, the reduced partition structure acts much like a partition correlator or matched partition filter, being sensitive only to samples falling within its defined range.
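The discarding rule can be sketched in a few lines. The code below is an illustration only: the function names, the convention that cell $k$ collects $a_{k-1} < x \leq a_k$, and the example scores are assumptions made here, not details taken from the report.

```python
from bisect import bisect_left


def reduced_partition_cell(x, breakpoints, a_lower, a_upper):
    """Return the 1-based cell index of x, or None if x is discarded.

    breakpoints : sorted interior thresholds a_1 < ... < a_{m-1}
    a_lower     : lower limit of the reduced space (role of a_L)
    a_upper     : upper limit of the reduced space (role of a_U)
    """
    if not (a_lower < x <= a_upper):
        return None                        # outside the reduced space: weight zero
    # bisect_left counts the breakpoints strictly below x.
    return bisect_left(breakpoints, x) + 1


def reduced_partition_statistic(samples, breakpoints, scores, a_lower, a_upper):
    """Sum the cell scores of the accepted samples; also count how many were used."""
    total = 0.0
    used = 0
    for x in samples:
        k = reduced_partition_cell(x, breakpoints, a_lower, a_upper)
        if k is None:
            continue                       # discarded sample does not advance the test
        total += scores[k - 1]
        used += 1
    return total, used


if __name__ == "__main__":
    t, used = reduced_partition_statistic(
        [-0.5, 0.2, 25.0, -40.0, 1.0], [0.0], [-1.0, 1.0], -3.62, 3.62
    )
    print(t, used)
```

Since a discarded sample neither changes the statistic nor counts toward a decision, the number of raw samples needed to terminate can only grow, which is the source of the ASN penalty discussed in this section.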

Notice that now the SPD is independent of the severity of the impulsive noise. High amplitudes are discarded, and low amplitudes are treated as nonimpulsive noise but are of no consequence due to the insensitivity of the partition structure (Refs. 3, 8, 9).

The question to be considered now is: how does the reduced partition structure affect the performance of the SPD in nonimpulsive noise environments?

Define $n_{ik}$ as the number of observations falling into the interval $[a_{k-1} < x_i \leq a_k;\ k = 1, 2, \ldots, m;\ i = 1, 2, \ldots, n]$.

Let $P_{vk} = P_v[a_{k-1} < x_i \leq a_k] = F_v(a_k) - F_v(a_{k-1})$, $(v = 0, 1)$, represent the probability of a sample, $x_i$, falling into the k-th interval under the hypothesis or alternative, under the constraint that


m m 

^ Pvk=  '^L'  and  ^nj^=n  . 

Following the procedure of Ref. 1,

$$T_n = \sum_{i=1}^{n} \sum_{k=0}^{m} b_k n_{ik} . \qquad (9)$$


Notice that now two additional quantiles, $F(a_L) = c_L$ and $F(a_U) = c_U$, must be estimated in establishing Equation (9).

Assuming the same conditions hold as in Ref. 1, the mean and variance of $T_n$ must be found to obtain the solution for the moment generating function.


Then, 


k=0 


m 


®1  Z ’ 2 ^ 

k=0 


p 

k=0  ok 


(10) 


(11) 


where

$$\sum_{k=0}^{m} A_k = \sum_{k=0}^{m} \left[F(a_k)\ln F(a_k) - F(a_{k-1})\ln F(a_{k-1})\right]$$

$$= F(a_L)\ln F(a_L) + F(a_1)\ln F(a_1) - F(a_L)\ln F(a_L) + \cdots + F(a_U)\ln F(a_U) - F(a_{m-1})\ln F(a_{m-1})$$

$$= c_U \ln c_U .$$

The  solution  for  t^(e)  takes  a rather  complicated  form, 
t^(e)  = (1  - 2e/9j)  - ZlCyin  Cu)/0J  V^ok 


(12) 


However, usually $c_L \approx 0$ and $c_U \approx 1$, and Eqs. (10), (11) and (12) reduce to those given in Reference 1. For example, under a two-sided shift alternative, for $c_L = .0001$ and $c_U = .9999$, there is only a slight change in the performance parameters, and these differences can be neglected for practical considerations.

In order to compare the performance under impulsive noise interference, a simulation was conducted for the shift case. The quantiles were chosen under Gaussian noise, $N(0, 1)$, and fixed. The interference was also Gaussian, but with large variance, $N(0, 10)$, and occurred 20 percent of the time. Table II compares the ASN and the number of errors made in 40 trials for both the normal and reduced partition space with $c_L = .0001$ and $c_U = .9999$. The space was partitioned into two disjoint regions: the interval from $-3.62$ to $3.62$ represents the signal space, and the intervals from $-\infty$ to $-3.62$ and from $3.62$ to $+\infty$ represent the impulsive noise space. Two signals ($\Delta = .5$ and $\Delta = 1$) were considered in the simulation for $m = 2$, 4 and 6. For all cases, the reduced partition space rejected the interference and made the correct decision. It is evident that the sensitivity (measured in errors made) to impulsive interference increased with $m$ under a normal partition structure, whereas the reduced partition structure's sensitivity remained constant at the expense of higher ASN.
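A rough feel for how many samples the reduced partition discards in such a simulation can be obtained analytically. The sketch below assumes a zero-mean two-component Gaussian mixture (signal absent), reads $N(0, 10)$ as a variance of 10, and uses a hypothetical function name `discard_probability`; none of these choices is confirmed by the report.

```python
from statistics import NormalDist


def discard_probability(limit, impulse_rate, noise_sd=1.0, impulse_sd=10 ** 0.5):
    """Probability that a zero-mean mixture sample falls outside [-limit, +limit].

    impulse_rate : fraction of samples drawn from the large-variance component
    noise_sd     : standard deviation of the background noise (assumed 1)
    impulse_sd   : standard deviation of the impulsive component
                   (assumed sqrt(10), i.e. N(0, 10) read as variance 10)
    """
    def two_sided_tail(sd):
        return 2.0 * (1.0 - NormalDist(0.0, sd).cdf(limit))

    return ((1.0 - impulse_rate) * two_sided_tail(noise_sd)
            + impulse_rate * two_sided_tail(impulse_sd))


if __name__ == "__main__":
    print(f"overall discard rate : {discard_probability(3.62, 0.2):.4f}")
    print(f"impulses discarded   : {discard_probability(3.62, 1.0):.4f}")
```

Under these assumptions only about a quarter of the impulsive samples exceed the limits; the rest fall inside the signal space, where the bounded cell scores limit their influence, consistent with the remark that some portion of the impulsive samples will fall within the range of the reduced partition.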

It should also be pointed out that for one-sided alternatives, impulsive noise is more troublesome and the reduced partition space structure is, therefore, more effective in combating interference.

TABLE II.  A simulation of a SPD in Gaussian N(0, 1), and Impulsive Gaussian N(0, 10) noise.

(a) Normal partition space, α = β = .0067

                                         Δ = .5               Δ = 1
  m    Condition                       ASN     Errors       ASN     Errors
  2    No impulsive interference       59.1    0            16.95   0
  2    Impulsive interference          67      0            20.45   0
  4    No impulsive interference       44.28   0            12.58   0
  4    Impulsive interference          47.1    1            13.5    1
  6    No impulsive interference       40.25   0            10.65   0
  6    Impulsive interference          41.65   0            10.95   3

(b) Reduced partition space, c_L = .0001, c_U = .9999, α = β = .0067

                                         Δ = .5               Δ = 1
  m    Condition                       ASN     Errors       ASN     Errors
  2    No impulsive interference       59.1    0            16.95   0
  2    Impulsive interference          68.55   0            21.55   0
  4    No impulsive interference       44.28   0            12.58   0
  4    Impulsive interference          56.65   0            15.35   0
  6    No impulsive interference       40.45   0            10.65   0
  6    Impulsive interference          48.38   0            14.63   0

Note: Impulses occurred 20% of the time.

SCORE ESTIMATION FOR THE GENERALIZED QUANTILE DETECTOR
J. M. Habib and L. Kurz

The class of generalized quantile tests $T_i$ was described in the previous report (Ref. 1). In this report, a recursive procedure for score estimation of the generalized quantile test is introduced which is based on the Kiefer-Wolfowitz procedure. In particular, the scores are selected recursively so that the efficacy of the detector is maximized. The procedure is easy to implement and operates on the data samples directly.

A.  The  Detector 

Consider $n_i$ i.i.d. samples from the distribution $F_i$, $i = 1, \ldots, M$. Form the empirical distribution functions

$$F_{i n_i}(x) = n_i^{-1} \sum_{j=1}^{n_i} u(x - x_{ij}) \qquad (1)$$

where $u(\cdot)$ is the unit step function (continuous from the right) and $x_{ij}$ is the $j$-th sample from $F_i$. Define the weighted sum of the empirical distributions

$$H_N(x) = \sum_{i=1}^{M} p_i F_{i n_i}(x) \qquad (2)$$

where 


Then 


Pi  =1.  Pi>0  . ^ n.  =N 


(3) 


(4) 


where $\nu(\cdot)$ is a bounded measure on $[0, 1]$. Equation (4) corresponds to a general representation for a broad class of nonparametric statistics, as shown in Reference 1. Let $\nu$ be the measure which assigns weight $c_j$ to the points

$$0 < \lambda_1 < \cdots < \lambda_m < 1 .$$

Define

$$T_i = \sum_{j=1}^{m} c_j W_i(\lambda_j), \qquad i = 1, \ldots, M \qquad (5)$$

where

$$W_i(\lambda_j) = F_{i n_i}\left[H_N^{-1}(\lambda_j)\right], \qquad (6)$$

$c_j$ are the scores of the test, and $m$ is the number of thresholds.

It was also shown in Ref. 1 that the efficiency of the test is high for small $m$, indicating that the penalty of using simple tests is small compared to the advantage gained in robustness.

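The efficacy of a $T$ test, $\varepsilon(T)$, is defined by

$$\varepsilon(T) = \left[\left.\frac{\partial E(T)}{\partial \theta}\right|_{\theta = 0}\right]^2 \Big/ \left.\mathrm{var}(T)\right|_{\theta = 0} \qquad (7)$$

where $\theta$ is a parameter defining separation between the distributions. For the $T_i$ test, Eq. (7) reduces to

$$\varepsilon(T_i) = \frac{C^{T} B\, C}{C^{T} \Sigma_0\, C} \qquad (8)$$

where

$$C = (c_1, \ldots, c_m) \qquad (9)$$

is the vector of scores and $B$, $\Sigma_0$ are $m \times m$ symmetric matrices with elements

$$b_{jk} = \left.\frac{\partial W_i(\lambda_j)}{\partial \theta}\right|_{\theta = 0} \left.\frac{\partial W_i(\lambda_k)}{\partial \theta}\right|_{\theta = 0} \qquad (10)$$

$$\sigma_{jk} = \mathrm{cov}\{W_i(\lambda_j),\, W_i(\lambda_k)\} . \qquad (11)$$

For $\theta = 0$, Eq. (11) reduces to

$$\sigma_{jk} = (\lambda_j \vee \lambda_k)\left[1 - (\lambda_j \wedge \lambda_k)\right] \qquad (12)$$

where

$$(a \vee b) = \min\{a, b\} \qquad (13)$$

and

$$(a \wedge b) = \max\{a, b\} . \qquad (14)$$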

It was shown in Ref. 1 that, under certain conditions, the scores which maximize Eq. (8), and the corresponding efficacy, may be calculated by forming a linear transformation:

$$\varepsilon(T_i) = Y^{T} \left(\Sigma_0^{-1/2}\, B\, \Sigma_0^{-1/2}\right) Y \qquad (15)$$

where

$$Y = \Sigma_0^{1/2}\, C \qquad (16)$$

and

$$Y^{T} Y = 1, \qquad (17)$$

and that the efficacy is equal to the largest eigenvalue of $B\,\Sigma_0^{-1}$ and the optimum scores are the eigenvector corresponding to that eigenvalue.

B.  A Recursive Approach to Score Estimation

In this section, it is proposed to estimate the optimum scores recursively. The efficacy function $\varepsilon(C)$ is a convex function, with a maximum at $C_{opt}$. Noting that any score vector $C$ corresponds to some p.d.f. $f(\cdot)$, we can see that

$$F(C) = E[\varepsilon \mid C] \qquad (18)$$

is a convex function with a maximum at $C_{opt}$.

The optimum scores, corresponding to the maximum of Eq. (18), are estimated recursively using the generalization of the Kiefer-Wolfowitz stochastic approximation method to the multivariate case. The iterative procedure to be used will be described next, and the necessary notation will be introduced.

The $m$ scores of the test, $c_1, c_2, \ldots, c_m$, will be denoted by the m-dimensional vector $C$.

A unit setting of the score $c_i$ and zero setting of the other $m - 1$ scores will be denoted by the unit vector

$$e_i, \qquad i = 1, 2, \ldots, m . \qquad (19)$$

The iterative adjustment will be made by making observations on the efficacy function $\varepsilon(C)$ at various score vector settings, as follows.

Suppose that we are at the setting $C_n$. We then make $2m$ observations, $\varepsilon_n^{1}, \varepsilon_n^{2}, \ldots, \varepsilon_n^{2m}$, where $\varepsilon_n^{2i-1} = \varepsilon(C_n + a_n e_i)$ is an observation with the score vector setting at

$$C = C_n + a_n e_i \qquad (19a)$$

and $\varepsilon_n^{2i} = \varepsilon(C_n - a_n e_i)$ is an observation with the score vector setting at

$$C = C_n - a_n e_i .$$

We then form the m-dimensional vector $\bar{\varepsilon}_n$, whose $i$-th component is given by

$$\varepsilon_n^{2i-1} - \varepsilon_n^{2i} = \varepsilon(C_n + a_n e_i) - \varepsilon(C_n - a_n e_i), \qquad i = 1, \ldots, m .$$

The next score setting in the iterative procedure is

$$C_{n+1} = C_n + \frac{\gamma_n}{a_n}\, \bar{\varepsilon}_n$$

where $\{\gamma_n\}$ and $\{a_n\}$ are sequences of positive numbers satisfying

(i) $\lim_{n \to \infty} a_n = 0$


00 


m)  ^ 


(for example, $\gamma_n = n^{-1}$, $a_n = n^{-1/3}$).

Under the condition that there is a unique maximum of $F(C)$, Blum showed the convergence of $C_n$ to $C_{opt}$ almost surely (a.s.) with the proper choice of $a_n$ and $\gamma_n$, namely

$$P\left[\lim_{n \to \infty} C_n = C_{opt}\right] = 1 .$$

C.  Numerical Results and Examples

To test the score estimation procedure outlined above, it was applied to examples with known numerical results. These examples, together with the results using the original and the developed method for score estimation, will be considered next.

The results using the developed method are calculated using a computer program to simulate the multidimensional Kiefer-Wolfowitz s.a. This program is included in Appendix D of Reference 3.
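The program in Appendix D of Reference 3 is not reproduced here. As a stand-in, the following minimal sketch shows what a multidimensional Kiefer-Wolfowitz recursion of this kind might look like. It is written in the ascent form, stepping along the finite-difference estimate because a maximum is sought, and it uses the example sequences $\gamma_n = n^{-1}$ and $a_n = n^{-1/3}$. The function names, the noisy quadratic test objective, and the seed are hypothetical; a real application would observe the efficacy instead.

```python
import random


def kiefer_wolfowitz_max(observe, c0, iterations, gain=1.0, width=1.0):
    """Seek the maximizer of a noisy function from 2m observations per step.

    observe    : callable returning a (possibly noisy) value at a score vector
    c0         : starting score vector
    iterations : number of recursion steps
    gain, width: scale factors for gamma_n = gain / n and a_n = width / n**(1/3)
    """
    c = list(c0)
    m = len(c)
    for n in range(1, iterations + 1):
        gamma_n = gain / n
        a_n = width / n ** (1.0 / 3.0)
        diff = []
        for i in range(m):
            plus = c[:]
            plus[i] += a_n                      # setting C_n + a_n e_i
            minus = c[:]
            minus[i] -= a_n                     # setting C_n - a_n e_i
            diff.append(observe(plus) - observe(minus))
        # Ascent step along the finite-difference estimate.
        c = [ci + (gamma_n / a_n) * di for ci, di in zip(c, diff)]
    return c


def make_noisy_objective(optimum, noise_sd, seed=0):
    """Hypothetical test objective: a concave quadratic plus Gaussian noise."""
    rng = random.Random(seed)

    def observe(c):
        value = -0.5 * sum((ci - oi) ** 2 for ci, oi in zip(c, optimum))
        return value + rng.gauss(0.0, noise_sd)

    return observe


if __name__ == "__main__":
    target = [0.605, 0.518, 0.605]
    start = [0.205, 0.118, 0.742]
    estimate = kiefer_wolfowitz_max(make_noisy_objective(target, 0.01), start, 2000)
    print([round(v, 3) for v in estimate])
```

The decreasing gain averages out observation noise, while the shrinking perturbation width reduces the bias of the finite-difference estimate; this balance is what the conditions on $\{\gamma_n\}$ and $\{a_n\}$ are meant to guarantee.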

Example  1 

For  a Gaussian  shift  parameter  and  m=3,  it  was  shown  that 


Xj  =.165 
X^  = .835 

The optimum score vector is given by

$$C_{opt} = [\,.605 \quad .518 \quad .605\,]$$

and the corresponding maximum efficacy is

$$\varepsilon_{max} = .883 .$$

The developed stochastic approximation method was applied as follows. Assume an arbitrary $C$,

$$C = C_{opt} + \Delta$$

where $\Delta$ is an arbitrary deviation vector,

$$\Delta = [\,\Delta_1 \quad \Delta_2 \quad \Delta_3\,] .$$

$\Delta_1$ and $\Delta_2$ are chosen arbitrarily and $\Delta_3$ is obtained from

$$C\,C^{T} = 1 .$$

The results obtained are as follows, for


^1  = 

1 

o 

^2  = 

1 

o 

^3  = 

. 137 

and 


r

. 2


The initial deviation vector

$$\Delta = [\,-0.4 \quad -0.4 \quad .137\,]$$

converged to

$$\Delta = [\,-0.1 \quad .08 \quad .02\,]$$

in only six iterations, and the efficacy corresponding to this score vector was

$$\varepsilon = .8735,$$

i.e., the efficacy deviation $\Delta\varepsilon = 1.07\%$.

Example  2 

For  a Gaussian  shift  parameter  and  m=2,  it  is  known  that  = . 27, 

The  optimum  score  vector  is  given  by 

Spt.  -'0^3 

and the corresponding maximum efficacy is

$$\varepsilon_{max} = .810 .$$

The  results  of  applying  the  stochastic  approximation  method  as  in  Example  1 are: 


and 


Al  = -4 


. 1 


A^  = .245 


, 5 


The initial deviation vector

$$\Delta = [\,-.4 \quad .245\,]$$

converged to

$$\Delta = [\,-.0049 \quad .0061\,]$$

in only four iterations, and the corresponding efficacy deviation was $\Delta\varepsilon = .04\%$.

Example  3 

For a Cauchy shift parameter, it is known that for $m = 2$ and

$$\lambda_1 = .445, \qquad \lambda_2 = .555,$$

the optimum score vector is given by

$$C_{opt} = [\,.707 \quad .707\,]$$

and the corresponding maximum efficacy is

$$\varepsilon_{max} = .429 .$$

The  results  of  applying  the  stochastic  approximation  method  as  in  Examples  1 and  2 
are: 


Aj  = -.3 


. 1 


^2  = -206 


.5 


and  = 


The initial deviation vector

$$\Delta = [\,-.3 \quad .206\,]$$

converged to

$$\Delta = [\,0.00 \quad 0.00\,]$$

in six iterations, and the efficacy deviation was $\Delta\varepsilon = 0\%$.

Example  4 

For  a Gaussian  shift  parameter,  it  is  known  that  for  m=  6,  and 

r = [.055  .190  .390  .952  .81  .945] 

The optimum score vector is given by

$$C_{opt} = [\,.501 \quad .371 \quad .333 \quad .333 \quad .371 \quad .501\,]$$

and the corresponding maximum efficacy is

$$\varepsilon_{max} = .956 .$$

The results of applying the stochastic approximation method using the developed computer program in Appendix D of Reference 3 are as follows:

$$\Delta = [\,-.501 \quad -.371 \quad -.333 \quad -.333 \quad -.371 \quad +.499\,]$$


and 


"n  = 


. 1 


^n  = 


. 2 
n 


The initial deviation vector converged to

$$\Delta = [\,.2 \quad -.04 \quad -.01 \quad -.04 \quad -.02 \quad -.07\,]$$

in 1000 iterations, and the efficacy deviation is

$$\Delta\varepsilon = .004 .$$

Several other examples were solved and the procedure converged in all the cases tried. However, the number of iterations required for convergence increased as the number of estimated scores increased. A summary of the solutions for various examples is shown in Figs. 1 to 4 for various score numbers.

It should be noted that this procedure is inexpensive in terms of CPU time. The maximum CPU time used for estimating six scores simultaneously was 2.24 CPU sec. to attain an efficacy deviation of .4%.

Joint Services Technical Advisory Committee

DATA SMOOTHING VIA ORDER STATISTICS
S-G. Tyan

Whenever a sequence of data is observed, be it an image or a speech parameter process, it is subject to "smoothing" at one or more of the many stages of signal processing. The purpose is to correct erratic data and to bring out the underlying signal in a clearer form.

Usually the underlying signal is thought to have a certain kind of "regularity." Thus, if the observed data sequence does not exhibit "irregular fluctuation," then smoothing is hardly necessary. In the past, with rare exceptions, people have used low-pass filters to carry out this task with notable success. However, in image signal processing, the story seems to be quite different. Here the edges are important information bearers and are usually abundant. Therefore, a low-pass filter, which smears the jumps that always accompany the edges, cannot accomplish the task it is supposed to do.

Instead of using the usual frequency content of a signal as a measure of its
"smoothness" or "regularity," we propose to use the supportedness of a datum by its
neighbors, which is to be defined later. Before we introduce the upper envelope and the
lower envelope of a data sequence, we shall offer a brief argument which goes like
this: Let $\{x_n\}$ be a data sequence and let $\{s_n\}$ be its smoothed version. Let k be a
fixed positive integer. Suppose $s_n$ is to be determined from k successive
neighbors of $x_n$ and $x_n$ itself. For example, we want to determine $s_n$ from

$$\{x_{n-k+i}, \ldots, x_{n+i}\},$$

where i is an integer, $0 \le i \le k$. Since we lack any idea about the nature of the
data sequence, we may assume that $s_n$ lies in an interval bounded by the maximum
and the minimum of $\{x_{n-k+i}, \ldots, x_{n+i}\}$, i.e.,

$$\min\{x_{n-k+i}, \ldots, x_{n+i}\} \le s_n \le \max\{x_{n-k+i}, \ldots, x_{n+i}\}.$$

Since i can be any integer from 0 to k,

$$\max_{0 \le i \le k} \min\{x_{n+i-k}, \ldots, x_{n+i}\} \le s_n \le \min_{0 \le i \le k} \max\{x_{n+i-k}, \ldots, x_{n+i}\}.$$


Definition: Let k be a fixed positive integer, and let $\{x_n\}$, $-\infty < n < \infty$, be a
sequence. The two sequences $\{\bar{x}_n\}$ and $\{\underline{x}_n\}$ are called the upper and lower
envelopes of $\{x_n\}$, respectively, where

$$\bar{x}_n = \min_{0 \le i \le k} \max\{x_{n+i-k}, \ldots, x_{n+i}\}$$

and

$$\underline{x}_n = \max_{0 \le i \le k} \min\{x_{n+i-k}, \ldots, x_{n+i}\}.$$

Remark: The sequences $\{\bar{x}_n\}$ and $\{\underline{x}_n\}$ are simply the local minmax and the local maxmin.

Obviously $\{\underline{x}_n\} \le \{x_n\} \le \{\bar{x}_n\}$. Let $I_n$ be the closed interval $[\underline{x}_n, \bar{x}_n]$. Then we
shall say that $x_n$, the datum, is well-supported by its neighbors (with indices spanning
from n-k to n+k) if $|\bar{x}_n - \underline{x}_n|$ is small and that it is not if $|\bar{x}_n - \underline{x}_n|$ is large. What is
interesting about this definition is that if a datum is supported by its neighbors on
one side, say $x_{n+1}, \ldots, x_{n+k}$ (i.e., they all lie in the proximity of $x_n$),
then $x_n$ according to the definition is well-supported by its neighbors. This property
makes jump-type discontinuities almost harmless, if one remembers how horrible they
are in Fourier analysis. If $I_n = [\underline{x}_n, \bar{x}_n]$, which is sort of an eye, closes for all n,
i.e., $\underline{x}_n = \bar{x}_n$, then we have the following theorem.

Theorem 1: The upper envelope and the lower envelope coincide with each other
iff $\{x_n\}$ is a locally monotonic sequence of length k+2.

A sequence is locally monotonic of length m (abbr. LOMO(m)) iff $(x_n, x_{n+1}, \ldots, x_{n+m-1})$
is monotonic for all n.
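
The envelope definitions and Theorem 1 can be tried out numerically. The following is only a minimal sketch, not the author's program: the function names are hypothetical, and a finite list is treated as an infinite sequence by repeating its first and last samples, which is an assumption made here so that every window is defined.

```python
def sample(x, j):
    """x_j for a finite list extended by repeating its first and last samples."""
    return x[min(max(j, 0), len(x) - 1)]


def upper_envelope(x, k):
    """Local minmax: over the k+1 windows of length k+1 that contain n,
    take the smallest window maximum."""
    return [min(max(sample(x, j) for j in range(n + i - k, n + i + 1))
                for i in range(k + 1))
            for n in range(len(x))]


def lower_envelope(x, k):
    """Local maxmin: over the same windows, take the largest window minimum."""
    return [max(min(sample(x, j) for j in range(n + i - k, n + i + 1))
                for i in range(k + 1))
            for n in range(len(x))]


def is_lomo(x, m):
    """True if every run of m consecutive samples is monotonic (LOMO(m))."""
    padded = [x[0]] * (m - 1) + list(x) + [x[-1]] * (m - 1)
    for start in range(len(padded) - m + 1):
        w = padded[start:start + m]
        rising = all(a <= b for a, b in zip(w, w[1:]))
        falling = all(a >= b for a, b in zip(w, w[1:]))
        if not (rising or falling):
            return False
    return True


k = 2
edge = [0, 0, 0, 5, 5, 5, 5, 2, 2, 2, 2]   # two jumps, locally monotonic
spike = [0, 0, 9, 0, 0]                    # isolated outlier

for name, x in (("edge", edge), ("spike", spike)):
    same = upper_envelope(x, k) == lower_envelope(x, k)
    print(name, "LOMO(k+2):", is_lomo(x, k + 2), "envelopes coincide:", same)
```

In this sketch the sequence with two well-separated jumps leaves the "eye" closed everywhere, while the single outlier opens it at one sample.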

If the upper and lower envelopes of $\{x_n\}$ do not coincide, or they are not even close
to each other for all n, then the need arises of smoothing the data. It seems advisable
to consider smoothers which belong to the class $\mathcal{L}$.

Definition: A smoother T is said to belong to $\mathcal{L}$ iff the smoothed version, $T\{x_n\}$, is
in between the upper and lower envelopes of $\{x_n\}$. In other words, we think that the
undisturbed signal lies in between $\{\underline{x}_n\}$ and $\{\bar{x}_n\}$.

Theorem 2. $\mathcal{L}$ is closed under convex combination and compounding, i.e., if $T_1$ and
$T_2$ are in $\mathcal{L}$, then $aT_1 + (1-a)T_2$, $0 \le a \le 1$, and $T_1 T_2$ are in $\mathcal{L}$.

Examples of smoothers in $\mathcal{L}$ are numerous and some of them have been applied to
speech processing and image processing with success. The most important of
them are possibly the family of running medians. A running median smoother of
odd length 2p+1 (abbr. RM(2p+1)) is defined as

$$\mathrm{RM}(2p+1)\{x_n\} = \{y_n\},$$

where $y_n = \mathrm{median}\{x_{n-p}, \ldots, x_{n+p}\}$ for all n. It is easy to check that RM(2p+1)
belongs to $\mathcal{L}$ for all $0 \le p \le k$. From Theorem 1 and Theorem 2 we know that a
LOMO(k+2) sequence is always invariant under any smoother made up of RM(2p+1),
$0 \le p \le k$, through convex combination and compounding. In fact, it can be shown that
many of these smoothers have the power to drive an arbitrary sequence toward a
LOMO(k+2) sequence while the jumps inherent in the signal sequence are preserved.

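If we let $\bar{E}$ and $\underline{E}$ be two operators such that $\bar{E}\{x_n\} = \{\bar{x}_n\}$ and $\underline{E}\{x_n\} = \{\underline{x}_n\}$,
respectively (in a product of operators the rightmost one is applied first), then we
have the following interesting theorem:

Theorem 3. Both $\underline{E}\bar{E}\{x_n\}$ and $\bar{E}\underline{E}\{x_n\}$ are locally monotonic sequences of length
k+2. Furthermore, if $\{s_n\}$ is a LOMO(k+2) sequence which lies in between the two
envelopes $\{\underline{x}_n\}$ and $\{\bar{x}_n\}$, then

$$\bar{E}\underline{E}\{x_n\} \le \{s_n\} \le \underline{E}\bar{E}\{x_n\}.$$

Remark: $\underline{E}\bar{E}\{x_n\}$ is the largest LOMO(k+2) sequence which lies below $\{\bar{x}_n\}$, and
$\bar{E}\underline{E}\{x_n\}$ is the smallest which is above $\{\underline{x}_n\}$.

Theorem 4. $\bar{E}\underline{E}\{x_n\} \le \mathrm{RM}(2k+1)\{x_n\} \le \underline{E}\bar{E}\{x_n\}$.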
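
The bound of Theorem 4 lends itself to a quick numerical check. The sketch below is an illustration under stated assumptions rather than the author's procedure: the helper names are hypothetical, finite lists are extended by repeating their end samples, and the lower bound is taken to be the upper envelope of the lower envelope (and the upper bound the lower envelope of the upper envelope), which is the reading under which the inequality is checked here.

```python
import random


def sample(x, j):
    """x_j for a finite list extended by repeating its first and last samples."""
    return x[min(max(j, 0), len(x) - 1)]


def upper_envelope(x, k):
    """Local minmax over the k+1 windows of length k+1 containing n."""
    return [min(max(sample(x, j) for j in range(n + i - k, n + i + 1))
                for i in range(k + 1))
            for n in range(len(x))]


def lower_envelope(x, k):
    """Local maxmin over the k+1 windows of length k+1 containing n."""
    return [max(min(sample(x, j) for j in range(n + i - k, n + i + 1))
                for i in range(k + 1))
            for n in range(len(x))]


def running_median(x, p):
    """RM(2p+1): median of the 2p+1 samples centred on n."""
    return [sorted(sample(x, j) for j in range(n - p, n + p + 1))[p]
            for n in range(len(x))]


def sandwich_holds(x, k):
    """Check lower bound <= RM(2k+1) <= upper bound sample by sample."""
    low = upper_envelope(lower_envelope(x, k), k)    # lower envelope first
    high = lower_envelope(upper_envelope(x, k), k)   # upper envelope first
    med = running_median(x, k)
    return all(a <= m <= b for a, m, b in zip(low, med, high))


rng = random.Random(0)   # fixed seed so the run is repeatable
checks = [sandwich_holds([rng.randint(0, 9) for _ in range(40)], k)
          for k in (1, 2, 3) for _ in range(20)]
print("Theorem 4 bound held in every trial:", all(checks))
```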

Extensions of the above definitions and theories to arrays of data are currently
under study. We are also examining the interplay between inverse filtering and
nonlinear smoothing with application to image restoration.
THE NYQUIST PROBLEM IN THE PRESENCE OF NONLINEARITIES IN THE
DUOBINARY SYSTEM

L.  Kurz  and  M.  Wernicki 

In the previous work by the authors the effect of memoryless nonlinearities on
intersymbol interference in a duobinary PAM system was considered. It was
demonstrated that these nonlinearities, natural to each system or introduced to suppress
impulsive noise, create problems in controlling intersymbol interference. In a
parallel research effort, a frequency domain approach to equalization in standard PAM
systems in the presence of memoryless nonlinearity represented by a cubic
polynomial was developed. The main purpose of this report is to extend the results of
Ref. 1 to duobinary systems.

A.  Design  in  the  Frequency  Domain 

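Let the linear part of the signal have a spectrum $G_1(w)$ (desirable part), and the
cubic term have a spectrum $G_3(w)$ (undesirable part). Using Nyquist's criterion, the
constraint equations can be written as

$$\sum_n G_1^{(n)}(w)\,G_R^{(n)}(w) = T x_0, \qquad |w| \le \frac{\pi}{T} \tag{1}$$

and

$$\sum_n G_3^{(n)}(w)\,G_R^{(n)}(w) = 0, \qquad |w| \le \frac{\pi}{T} \tag{2}$$

The minimization will be performed with respect to the variance of the noise (gaussian
noise)

$$\sigma^2 \propto \int_{-\pi/T}^{\pi/T} \sum_n \left|G_R^{(n)}(w)\right|^2 dw \tag{3}$$

Thus, for each value of w, $|w| \le \pi/T$, one desires the stationary point of

$$J = \sum_n \left\{ \left|G_R^{(n)}(w)\right|^2 + \lambda(w)\,G_1^{(n)}(w)\,G_R^{(n)}(w) + \gamma(w)\,G_3^{(n)}(w)\,G_R^{(n)}(w) \right\} \tag{4}$$

which is found using standard techniques of the calculus of variations, obtaining

$$G_R^{(n)}(w) = -\tfrac{1}{2}\lambda(w)\left[G_1^{(n)}(w)\right]^* - \tfrac{1}{2}\gamma(w)\left[G_3^{(n)}(w)\right]^* \tag{5}$$

where * denotes the complex conjugate operation.

Substituting Eq. (5) into Eqs. (1) and (2) yields

$$\sum_n \left\{ -\tfrac{1}{2}\lambda(w)\left|G_1^{(n)}(w)\right|^2 - \tfrac{1}{2}\gamma(w)\,G_1^{(n)}(w)\left[G_3^{(n)}(w)\right]^* \right\} = T x_0 \tag{6}$$

$$\sum_n \left\{ -\tfrac{1}{2}\lambda(w)\left[G_1^{(n)}(w)\right]^* G_3^{(n)}(w) - \tfrac{1}{2}\gamma(w)\left|G_3^{(n)}(w)\right|^2 \right\} = 0 \tag{7}$$

Solving Eq. (7) for $\gamma(w)$ results in

$$\gamma(w) = -\,\frac{\lambda(w)\sum_n \left[G_1^{(n)}(w)\right]^* G_3^{(n)}(w)}{\sum_n \left|G_3^{(n)}(w)\right|^2} \tag{8}$$

Using Eqs. (8) and (6), and solving for $\lambda(w)$ yields

$$\lambda(w) = \frac{-2\,T x_0 \sum_n \left|G_3^{(n)}(w)\right|^2}{\sum_n \left|G_1^{(n)}(w)\right|^2 \sum_n \left|G_3^{(n)}(w)\right|^2 - \left|\sum_n G_1^{(n)}(w)\left[G_3^{(n)}(w)\right]^*\right|^2} \tag{9}$$

Similarly,

$$\gamma(w) = \frac{2\,T x_0 \sum_n \left[G_1^{(n)}(w)\right]^* G_3^{(n)}(w)}{\sum_n \left|G_1^{(n)}(w)\right|^2 \sum_n \left|G_3^{(n)}(w)\right|^2 - \left|\sum_n G_1^{(n)}(w)\left[G_3^{(n)}(w)\right]^*\right|^2} \tag{10}$$

The desired filter characteristic is obtained by substituting Eqs. (9) and (10) into
Eq. (5), obtaining

$$G_R^{(n)}(w) = T x_0\,\frac{\left[G_1^{(n)}(w)\right]^* \sum_m \left|G_3^{(m)}(w)\right|^2 - \left[G_3^{(n)}(w)\right]^* \sum_m \left[G_1^{(m)}(w)\right]^* G_3^{(m)}(w)}{\sum_m \left|G_1^{(m)}(w)\right|^2 \sum_m \left|G_3^{(m)}(w)\right|^2 - \left|\sum_m G_1^{(m)}(w)\left[G_3^{(m)}(w)\right]^*\right|^2} \tag{11}$$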

The  number  of  terms  for  which  intersymbol  interference  can  be  eliminated  depends 
upon  the  available  bandwidth.  The  cubic  term  produces  power  terms  and  cross- 
product  terms  with  three  times  the  original  signal  bandwidth.  This  extra  bandwidth 
introduces  the  additional  degrees  of  freedom  in  the  receiver  design,  allowing  the 
additional  constraints  to  be  satisfied. 
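
The design procedure of this section amounts, at each frequency, to choosing the filter samples of least total squared magnitude subject to two linear constraints: the sum of products with the linear-term samples equals $T x_0$, and the sum of products with the cubic-term samples equals zero. The sketch below evaluates that closed-form solution at a single frequency and checks both constraints numerically. The function name, argument names, and sample values are all hypothetical and chosen only for illustration; they are not data from this report.

```python
def constrained_filter(g1, g3, t_x0):
    """Filter samples of least sum |G_R|^2 such that
    sum(G1 * G_R) == t_x0 and sum(G3 * G_R) == 0 at one frequency.

    g1, g3: equal-length lists of complex samples of the linear-term and
    cubic-term spectra (one entry per term of the sum over n).
    """
    a = sum(abs(u) ** 2 for u in g1)                      # sum |G1|^2
    b = sum(abs(v) ** 2 for v in g3)                      # sum |G3|^2
    c = sum(u.conjugate() * v for u, v in zip(g1, g3))    # sum G1* G3
    denom = a * b - abs(c) ** 2                           # zero only if g1, g3 are proportional
    return [t_x0 * (u.conjugate() * b - v.conjugate() * c) / denom
            for u, v in zip(g1, g3)]


# Illustrative (made-up) samples for three terms of the sum.
g1 = [1.0 + 0.2j, 0.5 - 0.1j, 0.25 + 0.0j]
g3 = [0.3 + 0.1j, -0.2 + 0.4j, 0.1 - 0.3j]
t_x0 = 1.0

g_r = constrained_filter(g1, g3, t_x0)
linear_sum = sum(u * r for u, r in zip(g1, g_r))   # should equal t_x0
cubic_sum = sum(v * r for v, r in zip(g3, g_r))    # should equal 0
print(abs(linear_sum - t_x0), abs(cubic_sum))
```

Both printed residuals should be at rounding-error level, which is a convenient sanity check before evaluating the expression over a grid of frequencies.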


B.  Application  of  the  Design  Procedure  to  Duobinary  Signals 

An example will help to clarify the use of this filter design procedure.
Specifically, consider the duobinary signal given by

$$S(w) = \begin{cases} 2T\cos\left(\dfrac{T}{2}w\right), & |w| \le \dfrac{\pi}{T} \\ 0, & \text{otherwise} \end{cases}$$

It may be shown that such a signal satisfies Nyquist's criterion for eliminating the
intersymbol interference. However, as with an ordinary PAM system, it is
advantageous to split the overall duobinary signal shaping equally between the transmitter and
receiver.

The frequency response due to the linear portion of the nonlinearity is shown in
Fig. 1, curve A. The cubic term of the nonlinearity was calculated using an IBM 370
computer, and its frequency response is shown in Fig. 1, curve B. The frequency response
corresponding to Eq. (11) is shown in Fig. 1, curve C.

Using the same constraints as in Ref. 2, an equalizer (filter) with transfer
characteristics satisfying these constraints may be designed. The time response of this
equalizer, as shown in Fig. 2, is not well damped, increasing the jitter problem. The
frequency response, shown in Fig. 1, reveals poor suppression properties of the
equalizer.


Fig.  2.  Output  time  responses  due  to  linear  and 
nonlinear  terms. 

Faced  with  these  difficulties,  it  is  desirable  to  seek  a better  equalizer  which 
may  allow  small  intersymbol  interference  due  to  the  linear  and  cubic  terms  yet,  at  the 
same  time,  reduce  the  jitter  and  noise  problem.  This  approach  is  preferable  because 
it  results  in  better  overall  system  performance. 

To this end, the following changes need be made in the constraint Eqs. (1)
and (2):

$$\sum_n G_1^{(n)}(w)\,G_R^{(n)}(w) = T x_0, \qquad |w| \le \frac{\pi}{T} \tag{12}$$

$$\sum_n G_3^{(n)}(w)\,G_R^{(n)}(w) = T x_0', \qquad |w| \le \frac{\pi}{T} \tag{13}$$

Following the procedure leading to Eq. (11), one obtains the transfer
characteristic of a modified equalizer in the form

$$G_{R,\mathrm{comp}}^{(n)}(w) = \frac{\left\{T x_0 \sum_m \left|G_3^{(m)}(w)\right|^2 - T x_0' \sum_m G_1^{(m)}(w)\left[G_3^{(m)}(w)\right]^*\right\}\left[G_1^{(n)}(w)\right]^* + \left\{T x_0' \sum_m \left|G_1^{(m)}(w)\right|^2 - T x_0 \sum_m \left[G_1^{(m)}(w)\right]^* G_3^{(m)}(w)\right\}\left[G_3^{(n)}(w)\right]^*}{\sum_m \left|G_1^{(m)}(w)\right|^2 \sum_m \left|G_3^{(m)}(w)\right|^2 - \left|\sum_m G_1^{(m)}(w)\left[G_3^{(m)}(w)\right]^*\right|^2} \tag{14}$$

The frequency response of the modified equalizer is shown in Figure 3. The new
equalizer has superior noise suppression capabilities when compared with the
equalizer of Figure 1.


Fig. 3. Frequency response of the filter with intersymbol
term remaining.
Since the time response of the new equalizer is well damped (see Fig. 4), it has
good jitter immunity. The small disadvantage of a tolerable intersymbol interference,
which is a function of the nonlinearity, the noise, and the signal amplitude, is a small
price to pay for an overall improvement in system performance.


Fig.  4.  Output  time  responses  for  filter  with 
intersymbol  interference  remaining. 
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A NEW  APPROACH  TO  THE  ALGEBRAIC  THEORY  OF  AUTOMATA 
T.  T.  Lee  and  E.  J.  Smith 

The objective of this research is to develop a new approach to the theory of
automata which will provide the formalism necessary to encompass the concepts of the
algebra of regular expressions and the algebraic structure theory of automata. More
specifically, the approach involves the construction of a mathematical model of
automata based upon the algebraic theory of fields, in which every regular set R is
represented by an element $\alpha$ of some abstract field K, such that the automaton which
accepts the regular set is represented by a state equation H(z), a certain polynomial
in the indeterminate variable z over the field K, having a root $\alpha$. Preliminary work
has revealed, perhaps surprisingly, that H(z) has properties in common with those of
differential and difference equations in the real domain; the set of roots of H(z) forms
an affine space, suggesting an analogy between the structure of automata and that of
linear systems.

In  the  following  sections  of  this  report,  the  preliminary  work  and  results  obtained 
are  presented  and  the  proposed  future  work  is  outlined. 

A.  Basic  Ideas  and  Preliminary  Work 

We  begin  by  introducing  some  symbols  and  terminology  which  for  the  most  part 
follow  standard  usage. 

(1) $\Sigma = \{0, 1\}$, the input alphabet of an automaton having two input symbols

(2) $\Sigma^* = \{\lambda, 0, 1, 00, 01, \ldots\}$, the set of words generated by $\Sigma$ under
concatenation

(3) $\mu : \Sigma^* \to N$, a bijective mapping from $\Sigma^*$ onto N, the set of nonnegative
integers, defined recursively by

(i) $\mu(\lambda) = 0$

(ii) If $\mu(\omega) = n$, then
$\mu(\omega 0) = 2n + 1$ and $\mu(\omega 1) = 2n + 2$ for all $\omega \in \Sigma^*$

(4) F, the prime field of characteristic 2, i.e., GF(2)

(5) $F[x] = \{f(x) \mid f(x) = \sum_{n=-\infty}^{\infty} a_n x^n,\ a_n \in F\}$,
the extension of the formal power series in the indeterminate
variable x over F

(6) $F[x]^+ = \{g(x) \mid g(x) = \sum_{n=0}^{\infty} a_n x^n,\ a_n \in F\}$,
the ring of the formal power series (polynomials) in x over F

(7) $2^{\Sigma^*}$, the power set of $\Sigma^*$
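
The recursive word numbering in item (3) is easy to tabulate. The following minimal sketch, with hypothetical function names, computes the index of a word over the alphabet {0, 1} from the two recursive rules (empty word to 0, appending 0 gives 2n+1, appending 1 gives 2n+2) and lists the exponents that a finite set of words contributes to its formal power series.

```python
from itertools import product


def mu(word):
    """Index of a binary word: empty -> 0, w0 -> 2n+1, w1 -> 2n+2."""
    n = 0
    for symbol in word:
        n = 2 * n + (1 if symbol == "0" else 2)
    return n


def phi(words):
    """Exponents carrying coefficient 1 in the series of a finite set of words."""
    return sorted(mu(w) for w in set(words))


# All words of length 0..3, in order of length.
words = ["".join(p) for length in range(4) for p in product("01", repeat=length)]
print([(w, mu(w)) for w in words[:7]])
print(phi({"", "0", "10"}))
```

Listing the words of length at most three shows each index from 0 to 14 occurring exactly once, which illustrates the bijection on an initial segment.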

We now define a mapping which provides the link between a regular set R and
a formal power series in x.

For any subset $L \subseteq \Sigma^*$, there is an associated formal power series $f_L(x)$ defined by

$$\Phi(L) = f_L(x) = \sum_{\omega \in L} x^{\mu(\omega)}$$

Conversely, for any power series

$$f(x) = \sum_{n=0}^{\infty} a_n x^n$$

there is a subset $R \subseteq \Sigma^*$ such that $\omega \in R$ iff $a_{\mu(\omega)} = 1$.

The mapping, $\Phi$, is bijective and possesses the following properties:

(i) $\Phi(\Sigma^*) = f_{\Sigma^*}(x) = 1 + x + x^2 + \cdots = \dfrac{1}{1+x}$

(ii) $\Phi(A \cup B) = \Phi(A) + \Phi(B)$, provided that $A \cap B = \emptyset$

(iii) $\Phi(A \cap B) = \Phi(A) \odot \Phi(B)$, where
$\Phi(A) \odot \Phi(B)$ denotes the Hadamard product of $\Phi(A)$ and $\Phi(B)$,
i.e., if $f_A(x) = \sum a_n x^n$ and $f_B(x) = \sum b_n x^n$, then
$f_A \odot f_B = \sum a_n b_n x^n$

(iv) $\Phi(\emptyset) = 0$, where $\emptyset$ is the empty set

(v) $\Phi(\bar{L}) = \Phi(\Sigma^*) - \Phi(L) = \dfrac{1}{1+x} - f_L(x)$,
where $\bar{L}$ is the complement of L.

Considering F[x] as a vector space over F, we now introduce a linear
operator, D,

$$D : F[x] \to F[x],$$

such that $Df(x) = f^2(x)$ for all $f(x) \in F[x]$, where $f^2(x) = [f(x)]^2$. It follows that,
given $R \subseteq \Sigma^*$, $\Phi(R) = f_R \in F[x]$, $R0 = \{\omega 0 \mid \omega \in R\}$, and $R1 = \{\omega 1 \mid \omega \in R\}$, then

$$\Phi(R0) = \sum_{\omega \in R} x^{2\mu(\omega)+1} = xDf_R = xf_R^2$$

and

$$\Phi(R1) = \sum_{\omega \in R} x^{2\mu(\omega)+2} = x^2Df_R = x^2f_R^2$$
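
Over GF(2) the operator D (squaring) simply doubles every exponent, because the cross terms of a square carry an even coefficient and vanish. The sketch below uses that fact to check the relations for appending a 0 or a 1 to every word of a small set. It is an illustrative sketch only: a series is represented as the set of exponents with coefficient 1, and all names are hypothetical.

```python
def mu(word):
    """Index of a binary word: empty -> 0, w0 -> 2n+1, w1 -> 2n+2."""
    n = 0
    for symbol in word:
        n = 2 * n + (1 if symbol == "0" else 2)
    return n


def phi(words):
    """Series of a finite set of words, as the set of exponents with coefficient 1."""
    return {mu(w) for w in words}


def D(f):
    """Squaring over GF(2): every exponent is doubled."""
    return {2 * e for e in f}


def times_x_power(f, k):
    """Multiply a series by x**k."""
    return {e + k for e in f}


R = {"", "1", "01", "110"}            # an arbitrary small example set
f_R = phi(R)
R0 = {w + "0" for w in R}
R1 = {w + "1" for w in R}

print(phi(R0) == times_x_power(D(f_R), 1))   # series of R0 equals x * D f_R
print(phi(R1) == times_x_power(D(f_R), 2))   # series of R1 equals x^2 * D f_R
```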

The  use  of  the  operator,  D,  will  be  illustrated  in  later  examples.  However,  we  first 
note  for  reference  some  terminology  describing  several  special  classes  of  polynomials. 

(1) A polynomial L(z) over F[x] is called a linearized polynomial iff

$$L(z) = \sum_{i=0}^{n} a_i z^{2^i} = \left(\sum_{i=0}^{n} a_i D^i\right) z$$

A polynomial L(z) is a linearized polynomial iff its roots form a vector
space over F.

(2) A polynomial A(z) over F[x] is called an affine polynomial iff

$$A(z) = L(z) + \xi$$

where L(z) is a linearized polynomial and $\xi \in F[x]$.

A polynomial  A(z)  is  an  affine  polynomial  iff  its  roots  form  an  affine  space 
over  F,  i.  e.  , if  aj  , are  roots,  then 

^ *^k^ 

The  following  analogies  are  suggested: 

(i)  Linearized  Polynomial  < > Homogeneous  Differential  Equation 

(ii)  Affine  Polynomial  < > Non-Homogeneous  Differential  Equation. 

B.  Applications  to  Finite  Automata 

We now illustrate by examples the application of some of the previous concepts
to finite state machines.

Let $R_1$ and $R_2$ be the sets of words (i.e., strings) accepted by states A and B,
respectively, of the given automaton. Then,

$$R_1 = R_1 0 + R_2 0 + \lambda$$

$$R_2 = R_1 1 + R_2 1$$

Our objective is to find the formal power series (i.e., polynomial) representations
corresponding to $R_1$ and $R_2$. Let $\Phi(R_1) = f$ and $\Phi(R_2) = g$; then, using the D-operator
notation,

$$f = xDf + xDg + 1$$

$$g = x^2Df + x^2Dg$$

or in matrix form

$$\begin{bmatrix} f \\ g \end{bmatrix} = \begin{bmatrix} x & x \\ x^2 & x^2 \end{bmatrix} \begin{bmatrix} Df \\ Dg \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

It follows that

$$f = xf^2 + xg^2 + 1$$

$$g = x^2f^2 + x^2g^2$$

By simple algebra, we find that f and g satisfy the affine polynomials, H(f) and
H(g), respectively:

$$(x^3 + x)f^2 + f + (x^3 + 1) = 0 \iff H(f) = 0$$

$$(x^2 + 1)g^2 + g + x^2 = 0 \iff H(g) = 0$$

After rearrangement, we have the factored forms

$$\{(x^2 + 1)f + (x^2 + x + 1)\} \cdot \{xf + (x + 1)\} = 0$$

$$\{(x^2 + 1)g + x^2\} \cdot \{g + 1\} = 0$$

Solving for f and g,

$$f = \frac{x^2 + x + 1}{x^2 + 1} = 1 + \frac{x}{1 + x^2}, \qquad f = \frac{x + 1}{x} = 1 + x^{-1}$$

$$g = \frac{x^2}{1 + x^2}, \qquad g = 1$$

We reject the second roots for f and g, since

$$f \ne \frac{x+1}{x} = 1 + x^{-1}, \quad \text{i.e., } x^{-1} \notin F[x]^+$$

and

$$g \ne 1, \quad \text{i.e., } \lambda \notin R_2 .$$

Thus for the given automaton we have characterized the regular expressions $R_1$ and
$R_2$ by algebraic expressions f and g, respectively. To illustrate further, let us
begin with $R_1$ and $R_2$, which can be found by conventional methods to be

$$R_1 = \lambda + \Sigma^* 0, \qquad R_2 = \Sigma^* 1, \qquad \Phi(\Sigma^*) = \frac{1}{1+x} .$$

Therefore,

$$\Phi(R_1) = f = 1 + xD\Phi(\Sigma^*) = 1 + xD\,\frac{1}{1+x} = 1 + \frac{x}{1+x^2}$$

$$\Phi(R_2) = g = x^2D\Phi(\Sigma^*) = x^2D\,\frac{1}{1+x} = \frac{x^2}{1+x^2}$$

which checks our previous solution for f and g. Thus for any deterministic
automaton we can construct "state equations," i.e., H(f) and H(g), whose roots are the
formal power series corresponding to regular expressions. The analogy with linear
dynamic systems is evident.
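
The solution of this example can be checked with truncated power-series arithmetic over GF(2). The sketch below is an illustration under stated assumptions, not part of the original work: a series is stored as a Python integer whose bit n is the coefficient of x^n, everything is truncated at x^N, and the two state equations being verified are the ones obtained by eliminating between f = x f^2 + x g^2 + 1 and g = x^2 f^2 + x^2 g^2. All names are hypothetical.

```python
N = 64                       # keep the terms x^0 .. x^(N-1)
MASK = (1 << N) - 1


def mul(a, b):
    """Product of two GF(2) series (bit n = coefficient of x^n), truncated at x^N."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a = (a << 1) & MASK
        b >>= 1
    return out


def inverse(g):
    """Series h with g*h = 1 modulo x^N; g must have constant term 1."""
    h = 1
    for i in range(1, N):
        if (mul(g, h) >> i) & 1:      # fix the coefficient of x^i in g*h
            h ^= 1 << i
    return h


X = 0b10
geometric = inverse(0b101)            # 1 / (1 + x^2)
f = 1 ^ mul(X, geometric)             # 1 + x / (1 + x^2)
g = mul(0b100, geometric)             # x^2 / (1 + x^2)

# (x^3 + x) f^2 + f + (x^3 + 1)   and   (x^2 + 1) g^2 + g + x^2
H_f = mul(0b1010, mul(f, f)) ^ f ^ 0b1001
H_g = mul(0b101, mul(g, g)) ^ g ^ 0b100

# Words ending in 0 have odd index, words ending in 1 have even index >= 2.
f_expected = 1 | sum(1 << n for n in range(1, N, 2))
g_expected = sum(1 << n for n in range(2, N, 2))

print(H_f == 0, H_g == 0, f == f_expected, g == g_expected)
```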

The affine polynomial corresponding to an automaton may or may not be
factorable; in the latter case, the formal power series representation, f(x), of the
corresponding regular set, R, cannot be represented as a rational fraction of the form
p(x)/g(x), where p(x) and g(x) are polynomials in x. Thus, a given regular set may
or may not be describable by a closed-form algebraic expression in x. It is
conjectured that a formal power series f(x) which is describable by a rational fraction of
the form p(x)/g(x) represents a regular set R, under the mapping $\Phi(R)$.

There is a need for the development of additional techniques for manipulating
formal power series expressions. We can generalize the previous equations
$\Phi(R0) = xD\Phi(R)$ and $\Phi(R1) = x^2D\Phi(R)$ as follows: $\Phi(R\omega) = x^{\mu(\omega)} D^{|\omega|}\,\Phi(R)$, where $|\omega|$
denotes the length of $\omega$.

Example 2

$$\Phi(R010) = xD\Phi(R01) = xDx^2D\Phi(R0) = xDx^2DxD\,\Phi(R) = x^5D^2xD\,\Phi(R) = x^9D^3\,\Phi(R)$$

where $\mu(010) = 9$ and $|010| = 3$.

The result follows directly from the formula for $\Phi(R\omega)$.
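
The rule for appending a whole word can be checked in the same spirit: appending a word of length m to every word of a set multiplies each index by 2^m and adds the index of the appended word. The sketch below is illustrative only; series are again represented as sets of exponents with coefficient 1, and the names are hypothetical.

```python
def mu(word):
    """Index of a binary word: empty -> 0, w0 -> 2n+1, w1 -> 2n+2."""
    n = 0
    for symbol in word:
        n = 2 * n + (1 if symbol == "0" else 2)
    return n


def phi(words):
    """Series of a finite set of words, as the set of exponents with coefficient 1."""
    return {mu(w) for w in words}


def append_word(f, w):
    """Apply x**mu(w) * D**len(w) to a series given as a set of exponents."""
    return {mu(w) + (2 ** len(w)) * e for e in f}


R = {"", "0", "11"}                       # an arbitrary small example set
for w in ("010", "01", "10", ""):
    direct = phi({r + w for r in R})      # number the extended words directly
    print(w or "(empty)", direct == append_word(phi(R), w))
```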

The  formal  power  series  of  a given  regular  set  satisfies  some  affine  polynomial, 
and  the  polynomial  can  be  constructed  from  the  state  diagram  of  the  corresponding 
automaton.  Given  the  regular  expression  for  some  regular  set,  one  can  construct  the 
affine  polynomial  for  the  corresponding  power  series.  The  following  two  examples 
provide  further  illustrations  of  the  application  of  the  D-operator. 

Example  3 

Given R = 1*, let $\Phi(R) = f(x)$.

Then,

$$f(x) = 1 + x^2D1 + x^2Dx^2D1 + \cdots = \left[1 + x^2D + (x^2D)^2 + \cdots\right]1 = \frac{1}{1 + x^2D}\,1$$

Hence, $(1 + x^2D)f = 1$.
Therefore, $x^2f^2 + f + 1 = 0$.

The state diagram of the automaton which accepts 1* is given below.

Starting with the implicit equation for $R_A$,

$$R_A = R_A 1 + \lambda$$

and letting $\Phi(R_A) = f$, we obtain

$$f = x^2Df + 1 = x^2f^2 + 1$$

Hence,

$$x^2f^2 + f + 1 = 0$$

which agrees with the previous result.

In this example, the affine polynomial is irreducible and hence f(x) cannot be
given in closed form.

Example  4 


Consider the identity $(01)^+ = 0(10)^*1$. First we let $R = (01)^+$ and $\Phi(R) = f(x)$.

For $f = \Phi((01)^+)$, $\mu(01) = 4$ and $|01| = 2$.

Then,

$$f(x) = x^4 + (x^4D^2)x^4 + (x^4D^2)^2x^4 + \cdots = \frac{1}{1 + x^4D^2}\,x^4$$

which yields $f + x^4D^2f = x^4$, or

$$x^4f^4 + f + x^4 = 0$$

Alternatively, let $R = 0(10)^*1$ and $\Phi(R) = f(x)$. Now $\mu(10) = 5$ and $|10| = 2$.
Proceeding algebraically as before, one obtains

$$f = x^2D\{1 + x^5D^2 + (x^5D^2)^2 + \cdots\}x = x^2D(1 + x^5D^2)^{-1}x$$

Continuing,

$$D^{-1}x^{-2}f = (1 + x^5D^2)^{-1}x$$

$$(1 + x^5D^2)D^{-1}x^{-2}f = x$$

$$D^{-1}x^{-2}f + x^5Dx^{-2}f = x$$

$$D^{-1}x^{-2}f + xf^2 = x$$

$$x^{-2}f + Dxf^2 = Dx$$

$$x^{-2}f + x^2f^4 = x^2$$

Finally,

$$x^4f^4 + f + x^4 = 0$$

The same result obtains as before. Equivalent regular expressions define the same
affine polynomial.
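
Both of the preceding examples can be confirmed by building the series directly from the words and substituting into the affine polynomial. The sketch below does this with truncated arithmetic over GF(2); it is a hedged illustration in which a series is stored as an integer bit mask (bit n is the coefficient of x^n), terms of degree N and above are dropped, and every name is hypothetical.

```python
N = 128                      # keep the terms x^0 .. x^(N-1)
MASK = (1 << N) - 1


def mul(a, b):
    """Product of two GF(2) series (bit n = coefficient of x^n), truncated at x^N."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a = (a << 1) & MASK
        b >>= 1
    return out


def mu(word):
    """Index of a binary word: empty -> 0, w0 -> 2n+1, w1 -> 2n+2."""
    n = 0
    for symbol in word:
        n = 2 * n + (1 if symbol == "0" else 2)
    return n


def series(words):
    """Truncated series of a collection of distinct words."""
    s = 0
    for w in words:
        if mu(w) < N:
            s |= 1 << mu(w)
    return s


# The set 1*: words of j ones, enough of them to fill all degrees below N.
f_ones = series("1" * j for j in range(8))
H_ones = mul(0b100, mul(f_ones, f_ones)) ^ f_ones ^ 1          # x^2 f^2 + f + 1

# The set (01)+ written in its two equivalent forms.
plus_form = {"01" * j for j in range(1, 6)}
star_form = {"0" + "10" * j + "1" for j in range(0, 5)}
f_01 = series(plus_form)
f_01_sq = mul(f_01, f_01)
H_01 = mul(1 << 4, mul(f_01_sq, f_01_sq)) ^ f_01 ^ (1 << 4)    # x^4 f^4 + f + x^4

print(H_ones == 0, plus_form == star_form, H_01 == 0)
```

Because both residuals vanish modulo x^N, the truncated series are consistent with the two affine polynomials up to the chosen degree; the check does not, of course, prove the identities in general.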

Formal  rules  for  the  manipulation  of  operator  polynomials  have  yet  to  be 
developed. 



C. Concepts Related to Structural Properties of Automata

In  the  remainder  of  this  section  we  illustrate  some  preliminary  ideas  which  are 
related  to  the  structural  properties  of  automata. 

Example  5 

Consider the six-state finite automaton, L, described by the state table and set
of regular equations shown below.

$$\begin{array}{c|cc|l}
L & 0 & 1 & \\ \hline
1 & 2 & 3 & R_1 = R_2 0 + R_5 0 + R_6 0 \\
2 & 1 & 3 & R_2 = R_1 0 + R_4 1 \\
3 & 4 & 5 & R_3 = R_1 1 + R_2 1 + R_4 0 \\
4 & 3 & 2 & R_4 = R_3 0 \\
5 & 1 & 6 & R_5 = R_3 1 + R_6 1 \\
6 & 1 & 5 & R_6 = R_5 1
\end{array}$$

Let $Z_i = \Phi(R_i) \in F[x]$ for i = 1, 2, ..., 6.

Then, from previous work,

$$\begin{bmatrix} Z_1 \\ Z_2 \\ Z_3 \\ Z_4 \\ Z_5 \\ Z_6 \end{bmatrix} =
\begin{bmatrix}
0 & x & 0 & 0 & x & x \\
x & 0 & 0 & x^2 & 0 & 0 \\
x^2 & x^2 & 0 & x & 0 & 0 \\
0 & 0 & x & 0 & 0 & 0 \\
0 & 0 & x^2 & 0 & 0 & x^2 \\
0 & 0 & 0 & 0 & x^2 & 0
\end{bmatrix}
\begin{bmatrix} DZ_1 \\ DZ_2 \\ DZ_3 \\ DZ_4 \\ DZ_5 \\ DZ_6 \end{bmatrix}$$

In general, the machine is characterized by the algebraic expression

$$Z = A\,D\,Z,$$

where $Z \in F[x]^6$, $F[x]^6$ denotes the six-fold Cartesian product of F[x], and A
is a 6x6 matrix over F[x]. The characteristic function of matrix A is readily
found to be

$$f_A(\lambda) = [\lambda + x]\,[\lambda + x^2]\,[\lambda + (x + x^2)]\,[\lambda^3 + (x^2 + x^3)\lambda + (x^4 + x^5)]$$

It has been found that if $\pi$ is a partition with the substitution property on the set
of states of machine L, and $L_\pi$ is the image machine with the state equation
$Z' = A'DZ'$, and $f_{A'}(\lambda)$ is the characteristic function of matrix A', then $f_{A'}(\lambda)$ is a
divisor of $f_A(\lambda)$.

For the given machine, L, we observe that the following partition possesses the
substitution property

$$\pi = \{\overline{1,2};\ \overline{3};\ \overline{4};\ \overline{5,6}\}$$

and, accordingly, that given below is an image machine.

$$\begin{array}{c|cc|l}
L_\pi & 0 & 1 & \text{where} \\ \hline
q_1 & q_1 & q_2 & q_1 = \overline{1,2} \\
q_2 & q_3 & q_4 & q_2 = \overline{3} \\
q_3 & q_2 & q_1 & q_3 = \overline{4} \\
q_4 & q_1 & q_4 & q_4 = \overline{5,6}
\end{array}$$

The matrix A' associated with machine $L_\pi$ is found to be

$$A' = \begin{bmatrix}
x & 0 & x^2 & x \\
x^2 & 0 & x & 0 \\
0 & x & 0 & 0 \\
0 & x^2 & 0 & x^2
\end{bmatrix}$$

and the characteristic function of A' is given by

$$f_{A'}(\lambda) = [\lambda + (x + x^2)]\,[\lambda^3 + (x^2 + x^3)\lambda + (x^4 + x^5)]$$

which, by inspection, is seen to be a factor of $f_A(\lambda)$. If $f_A(\lambda)$ for a given machine
L is irreducible, the machine does not possess an image machine. Thus we have a
link between a structural property of a finite automaton and our algebraic
characterization of the machine. A question to be explored further is the following: if $f_{A'}(\lambda)$
is a divisor of $f_A(\lambda)$, does this imply that the state space of machine $L_\pi$ is an
invariant subspace of the machine L? The relationship between the algebraic
representation and other structural properties will be explored as well.
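
The divisibility observed in Example 5 can be recomputed mechanically. The sketch below forms the characteristic function det(lambda I + A) over GF(2)[x] by cofactor expansion (in characteristic 2 the signs of the cofactors do not matter) for the six-state machine and for its four-state image. It is a hedged illustration, not the authors' computation: the transition matrices are transcribed from the state tables of the example, with an entry x for a 0-transition and x^2 for a 1-transition into the row state, the factor polynomials were recomputed from those matrices, and all names are hypothetical. An element of GF(2)[x] is stored as an integer bit mask, and a polynomial in lambda as a list of such coefficients, lowest power first.

```python
def gf2_mul(a, b):
    """Product in GF(2)[x]; bit n of an integer is the coefficient of x^n."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a <<= 1
        b >>= 1
    return out


def poly_add(p, q):
    """Sum of two polynomials in lambda with GF(2)[x] coefficients."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a ^ b for a, b in zip(p, q)]


def poly_mul(p, q):
    """Product of two polynomials in lambda with GF(2)[x] coefficients."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] ^= gf2_mul(a, b)
    return out


def trim(p):
    """Drop zero coefficients of the highest powers of lambda."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def det(m):
    """Determinant by expansion along the first row (signs vanish mod 2)."""
    if len(m) == 1:
        return m[0][0]
    total = [0]
    for col, entry in enumerate(m[0]):
        if any(entry):
            minor = [row[:col] + row[col + 1:] for row in m[1:]]
            total = poly_add(total, poly_mul(entry, det(minor)))
    return total


def char_poly(a):
    """Characteristic function det(lambda*I + A) of a matrix over GF(2)[x]."""
    n = len(a)
    m = [[[a[i][j], 1] if i == j else [a[i][j]] for j in range(n)]
         for i in range(n)]
    return trim(det(m))


X, X2 = 0b10, 0b100
A = [[0, X, 0, 0, X, X],
     [X, 0, 0, X2, 0, 0],
     [X2, X2, 0, X, 0, 0],
     [0, 0, X, 0, 0, 0],
     [0, 0, X2, 0, 0, X2],
     [0, 0, 0, 0, X2, 0]]
A_image = [[X, 0, X2, X],
           [X2, 0, X, 0],
           [0, X, 0, 0],
           [0, X2, 0, X2]]

cubic = [0b110000, 0b1100, 0, 1]          # lambda^3 + (x^2+x^3) lambda + (x^4+x^5)
f_image = poly_mul([X ^ X2, 1], cubic)    # [lambda + (x+x^2)] * cubic
f_full = poly_mul(poly_mul([X, 1], [X2, 1]), f_image)

print(char_poly(A_image) == f_image, char_poly(A) == f_full)
```

The two extra linear factors of the larger machine correspond to the two merged pairs of states, which is one way to read the divisibility statement of this section.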

Another approach will be based upon the introduction of a certain linear
transformation. For each partition $\pi$ on $S = \{S_1, S_2, \ldots, S_n\} = \{S_i\}$, the set of states of
machine L, there is a representation function, r, such that

$$S_i \equiv S_j\ (\pi) \quad \text{iff} \quad r(i) = r(j)$$

where the function r satisfies the following criteria:

(1) $r(i) \le i$, for all i

(2) $r \cdot r(i) = r(i)$, for all i.

We now define the n x n matrix, T, induced by the representation function, r,
such that

$$T = (p_{ij}), \qquad p_{ij} = \begin{cases} 1, & \text{if } r(j) = i \\ 0, & \text{if } r(j) \ne i \end{cases}$$

We can show that $T^2 = T$ as follows: suppose $T^2 = (m_{ij})$; we must demonstrate that

$$m_{ij} = \sum_k p_{ik}\,p_{kj} = p_{ij}$$

For each column of T, there is one and only one element equal to 1; all other
elements equal 0. Therefore, if $m_{ij} = 1$, there is some $\ell$, for $1 \le \ell \le n$, such
that $p_{i\ell} = p_{\ell j} = 1$. Then by definition, $r(\ell) = i$ and $r(j) = \ell$, which implies $r \cdot r(j) =
r(\ell) = i$. By the idempotent property of r, $r(j) = i$ and thus $p_{ij} = 1$. Conversely, if
$p_{ij} = 1$, then $r(j) = i$; then, $r(i) = r \cdot r(j) = r(j) = i$ and $p_{ii} = 1$. It follows that

$$\sum_{k} p_{ik}\,p_{kj} = p_{ii}\,p_{ij} = p_{ij} = m_{ij} .$$

Therefore, $T^2 = T$.
k4l  ^ U U U 


The transformation matrix T induced by the previously given SP partition $\pi$ on
the set of states of the machine in Example 5 is as follows:

$$T = \begin{bmatrix}
1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

It is easy to verify that $T^2 = T$.

We know that a linear transformation, T, on a vector space V is a projection
iff $T^2 = T$. It follows that the transformation, T, induced by the partition $\pi$ having
the substitution property is a projection over the state space $F[x]^n$. The properties possessed
by such a transformation will be investigated. In general, the use of the T matrix as
a tool for exploring structural properties of machines will be pursued.
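
The construction of T from a representation function is mechanical, as the short sketch below illustrates for the partition of Example 5 that merges states 1 and 2 and states 5 and 6. The representation function used here, which maps each state to the lowest-numbered state of its block, and the function names are assumptions made for the illustration.

```python
def projection_matrix(r):
    """Matrix with entry (i, j) equal to 1 when r(j) = i; states are numbered from 1."""
    n = len(r)
    return [[1 if r[j] == i + 1 else 0 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    """Ordinary matrix product of two square integer matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# r(1)=r(2)=1, r(3)=3, r(4)=4, r(5)=r(6)=5: lowest state of each block.
r = [1, 1, 3, 4, 5, 5]
T = projection_matrix(r)

for row in T:
    print(row)
print("T*T == T:", mat_mul(T, T) == T)
```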

D.  Proposed  Future  Research 

In  accordance  with  the  general  approach  for  the  development  of  a unified  algebraic 
theory  of  automata  outlined  in  the  previous  section,  the  proposed  program  will  be 
pursued  with  the  following  objectives: 

(1)  To  obtain  a well-defined,  unique  representation  of  a regular  set  as  an 
element  of  an  abstract  field  K 

(2)  To  develop  procedures  for  setting  up  the  state  equation  over  K for  a given 
automaton,  such  that  the  solution  of  the  state  equation  is  the  representation 
of  the  regular  set  accepted  by  the  automaton 

(3)  To  analyze  the  properties  of  the  affine  space  formed  by  the  roots  of  the  state 
equation  of  the  automaton 

(4)  To  investigate  the  structure  and  general  criteria  for  decomposition  of  an 
automaton  from  the  state  equation  associated  with  it 

(5)  To  investigate  the  analogies  between  automata  and  linear  systems,  based 
upon  the  abstract  field  representation  of  the  former. 

In  general,  the  goal  is  to  develop  a comprehensive  algebraic  theory  of  automata 
based  upon  the  algebraic  theory  of  fields. 
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A new  hardware  description  language  referred  to  as  MDSL  (Microcomputer  Design 
and Simulation Language) is described and its application to an illustrative example
consisting of an array of identical processors operating asynchronously is shown. The
advantages  of  such  a language  are  severalfold:  (a)  it  serves  as  a tool  for  studying  the 
performance  of  various  computer  architectures,  (b)  it  provides  a well-defined  notation 
for  describing  a particular  architecture,  (c)  it  permits  the  quantitative  comparison  of 
alternative system designs by writing test programs to be run on simulated machines,
(d) it enables designers to investigate alternative designs and to evaluate various
trade-offs before actually constructing the computer.


MDSL as a particular hardware description language was designed to be a highly
structured, modular, register transfer type language that incorporates the
control-structure characteristic of a microprogrammable system, such as the one proposed by
Wilkes in the early 1950's. As a consequence of the modular structure of MDSL and
the decision to write it in PL/I, it was possible to implement quickly a basic subset of
the full language for immediate use and then to incorporate the more advanced
features gradually. The current version of MDSL (Version 2.00) allows, among other things, the
capability of working with multiple copies of processors connected together in a network
and with various types of content-addressable parallel processors.


A partial description of the language and its options is given in the present report.
A more complete guide to the MDSL language is provided in Refs. 2 and 3, along with a
number of examples showing its use in modelling various computers and computer-like
forms.


B.  MDSL  Overview 

The five basic components of the classical digital computer (input, output,
memory, arithmetic and logic, and control) are represented in Figure 1. These components
communicate by means of signals that contain data, instructions, or control information.
The control unit controls the sequencing and direction of these signal flows and
therefore determines how the computer operates. Wilkes proposed a model of
microprogramming in order to systematize computer description by providing an orderly approach
to the description of the control unit. Figure 2 contains a simplified representation of
the Wilkes model.

Fig. 1. Functional components of a modern digital computer. (Legend: information or
instruction flow; control signal flows.)

[Fig. 2 label: From Conditional Flip Flops]

In Fig. 2 the horizontal lines are called microinstructions; the
four major components of the model are as follows:

(1)  Control  Line  Matrix,  The  output  (vertical  lines)  of  this  matrix  control  the 
flow  of  data,  instructions,  and  special  control  signals  throughout  the  com- 
puter. For  example,  vertical  line  might  cause  the  program  counter  to 
be  incremented.  Each  vertical  line  corresponds  to  a microoperation. 

(2)  Next  Microinstruction  Matrix.  The  output  (vertical  lines)  of  this  matrix 
forms  the  binary  address  or  label  of  the  next  microinstruction. 

(3)  Lines  From  Conditional  Flip  Flops.  These  inputs  (vertical  lines)  contain 
information  about  the  conditions  and  states  of  various  parts  of  the  computer. 

(4)  Decoding  Tree.  The  inputs  come  from  the  Next  Microinstruction  Matrix 
and  the  output  is  a signal  on  one  of  the  horizontal  lines,  the  next  microin- 
struction to  be  performed. 
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In  Fig.  2 a dot  at  the  intersection  of  a horizontal  line  and  a control  line  means  that 
the indicated microoperation will be performed as part of that microinstruction. The
address of the next microinstruction is found by interpreting dots as 1's and no dots as
0's in the Next Microinstruction Matrix and then interpreting the 1's and 0's as a binary
number. Note that the lines from the conditional flip flops cause some horizontal lines
to be split into two lines before entering the Next Microinstruction Matrix. Thus,
depending on the condition, one of two possible microinstructions will follow.
Consequently, each microinstruction has a set of microoperations, a possible test, and
one or two possible microinstructions to be performed next.
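
As a rough illustration of how the two matrices determine sequencing, the following Python sketch stores each microinstruction as a row of dots. The data layout, the names `decode_address`, `step`, and `MICROPROGRAM`, the flip-flop name `carry`, and the three-row example are all assumptions made for this illustration; they do not come from the Wilkes model or from MDSL.

```python
def decode_address(dots):
    """Read a row of the Next Microinstruction Matrix as a binary number.

    dots is a string such as ".*.*", where "*" is a dot (1) and "." is no dot (0).
    """
    value = 0
    for mark in dots:
        value = value * 2 + (1 if mark == "*" else 0)
    return value


def step(microprogram, address, flip_flops):
    """Perform one microinstruction; return (active control lines, next address)."""
    row = microprogram[address]
    # Control Line Matrix: a dot in column k activates control line k.
    active = [k for k, mark in enumerate(row["control"]) if mark == "*"]
    # A row split by a conditional flip flop has two candidate next addresses.
    if row["test"] is None:
        next_row = row["next"][0]
    else:
        next_row = row["next"][1] if flip_flops[row["test"]] else row["next"][0]
    return active, decode_address(next_row)


# Hypothetical microprogram: four control lines, 2-bit next addresses.
MICROPROGRAM = {
    0: {"control": "*..*", "test": None, "next": [".*"]},
    1: {"control": ".*..", "test": "carry", "next": ["*.", ".."]},
    2: {"control": "..*.", "test": None, "next": [".."]},
}

if __name__ == "__main__":
    print(step(MICROPROGRAM, 0, {"carry": False}))
    print(step(MICROPROGRAM, 1, {"carry": True}))
```

Each row therefore carries exactly the three ingredients named in the text: a set of microoperations, an optional test, and one or two successors.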

MDSL  is  organized  as  a set  of  relatively  independent  sections  whose  functions 
follow  the  Wilkes  model.  Each  section  begins  with  a control  card  which  contains  a $ in 
column  one  followed  by  one  of  the  nine  control  section  names. 

$JOB, name

This, the first card in an MDSL job, causes a header page to be printed and
all tables to be initialized.

$DEF

All definitions of information storage components (registers, flags, and
memories) follow this card.

$STATIC

This card precedes the definitions of the microoperations. This is where
the controllable data flows are defined and given labels.

$MICRO

The "microprogram" of the computer-type device being defined is given
following this control card. This corresponds to filling in the dots in the
Control Line and Next Microinstruction matrices.

$INITIAL

After this the initial values of the registers and flags are given.

$PROGRAM

After this the initial values of various memory locations are given.

$RUN, TRACE(i, j, k, ...)(R1, R2, R3, ...), HIST

This causes the previously defined machine to be simulated, microinstruction
by microinstruction, with a trace that will cause the contents of registers
R1, R2, R3, ... to be printed out whenever microinstructions i, j, k, ... are
encountered. HIST causes a histogram showing the number of times each
microinstruction was performed to be printed out upon successful termination
of the simulation.

$DUMP

After this follow one or more memory dump specifications.

$END

This card is used to terminate the current job.
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After  a simulation  has  terminated,  a user  may  simulate  the  running  of  a new  pro- 
gram on  the  previously  defined  machine  by  simply  following  the  $RUN  card  with  a re- 
initialization of  the  key  registers  using  $ INITIAL,  the  entering  of  a new  program  using 
$ PROGRAM,  and  a new  simulation  using  $RUN.  Because  of  its  modular  structure, 
MDSL  allows  control  sections  to  be  used  and  reused  in  a variety  of  ways. 

C.  Basics  of  the  MDSL  Language 

The  basic  syntax  and  semantics  of  MDSL  will  be  introduced  by  means  of  a set  of 
constructs  taken  from  the  complete  example  given  later  in  this  report,  where  a set  of 
processors  asynchronously  traverse  a network  in  order  to  determine  the  minimum  cost 
path.  A detailed  description  of  the  syntax  is  given  in  Reference  3. 

All information storage components (registers, flags, and memories) are defined
in the $DEF section using the following basic format:


identifier := storage specification

Below are three examples that illustrate three basic variations of the above general
definition.

PTR := <0: 8> /POINTER TO THE UNASSIGNED MEMORY

In the above, a 9-bit register with bit positions numbered from 0 to 8 is defined
and assigned the name PTR. The / indicates the beginning of a comment, which will be
assumed to run to the end of the current card.

MBR.8 := <0: 7> /MEMORY BUFFER REGISTER FOR BOTH MEMORIES

In the above, a set of eight 8-bit registers with the generic name MBR is defined,
each with bit positions numbered from 0 to 7.

PATH := (0: 511)<0: 7> /WILL CONTAIN THE PATHS AND PATH LENGTHS

In the above, a memory named PATH is defined to have 512 words (numbered
from 0 to 511) of 8 bits each (numbered from 0 to 7).
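
To make the three variations concrete, here is a minimal Python sketch that reads one definition card and reports how many registers, words, and bits it declares. The regular expression is only a guess at the grammar based on the three examples above (the full syntax is in Reference 3), and `DEF_PATTERN` and `parse_definition` are hypothetical names, not part of MDSL.

```python
import re

# Assumed card shape: NAME[.copies] := [(lo: hi)]<lo: hi> [/comment]
DEF_PATTERN = re.compile(
    r"^\s*(?P<name>[A-Z][A-Z0-9]*)"                     # identifier
    r"(?:\.(?P<copies>\d+))?"                           # optional ".k": k registers
    r"\s*:\s*=\s*"                                      # the := sign
    r"(?:\(\s*(?P<wlo>\d+)\s*:\s*(?P<whi>\d+)\s*\))?"   # optional word range
    r"\s*<\s*(?P<blo>\d+)\s*:\s*(?P<bhi>\d+)\s*>\s*$"   # bit range
)


def parse_definition(card):
    """Parse one storage definition card into a small dictionary."""
    body = card.split("/", 1)[0]      # "/" starts a comment running to end of card
    match = DEF_PATTERN.match(body)
    if match is None:
        raise ValueError("not a recognizable storage definition: " + card)
    words = None                      # None marks a register rather than a memory
    if match.group("wlo") is not None:
        words = int(match.group("whi")) - int(match.group("wlo")) + 1
    return {
        "name": match.group("name"),
        "copies": int(match.group("copies") or 1),
        "words": words,
        "bits": int(match.group("bhi")) - int(match.group("blo")) + 1,
    }


if __name__ == "__main__":
    print(parse_definition("PATH := (0: 511)<0: 7> /WILL CONTAIN THE PATHS"))
```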

In  the  $ STATIC  section  all  the  controllable  data  flows  are  defined  and  given  labels 
using  the  following  basic  format: 


label:  source(s)>  flow  modifier>  target(s) 

This  will  cause  the  source(s)  to  be  transferred  to  the  target(s)  under  control  of  the  flow 
modifier.  The  following  examples  demonstrate  some  of  the  possible  variations  of  this 
basic  form. 
6: MBR>>NODE /SEND MBR TO NODE

The above defines control line 6 to control the transfer of the contents of register
MBR into the NODE register.

7: 1>>SWT /RESET SWT TO 1

Control line 7 causes the constant 1 to be sent to the SWT register.

20: F.SWT>SET>F.SWT /SET FLAG OF CONTROLLED MACHINE

In the above, control line 20 sets to one the F register of machine SWT.

13: MBR,COST>ADD>COST /UPDATE COST

If control line 13 is activated, then the contents of the MBR and COST registers
will be added and the result will be stored in COST.

2: PATH(MAR)>>MBR /READ FROM PATH INTO MBR

Control line 2 controls the transfer of the contents of the location of memory
PATH pointed to by the MAR register into the MBR register.

21: 00000000>>PATH(*) /CLEAR MEMORY INITIALLY

As defined above, control line 21 sets to zero all words of the memory named
PATH. The * symbol is used to mean "all possible values."

The general format for a microinstruction defined in $MICRO is as follows:


label: (control line list)(condition)(branch specification)

The following examples demonstrate some of the variations of the above format.

2: (5,3,4) /READ CURRENT NODE, STORE IN PATH

The above microinstruction, labelled 2, causes control lines 5, 3, and 4 to be
activated. Since no condition is specified, the next microinstruction will be the next
one in sequence, that is, microinstruction 3.

11: ( )(F.SWT = 1)(11) /LOOP IF CONTROLLED MACHINE IS NOT READY

Microinstruction 11 above will cause no control lines to be activated. It will check
if the F register of machine SWT is 1. If it is, the next microinstruction will be
number 11. Otherwise, the next will be number 12.


17: (3)(1)(0) /STORE COST AND HALT

Microinstruction 17 is a special case of a computed goto instruction. It causes
line 3 to be activated and then goes to the first branch address if the value in the
condition field is 0, the second branch address if the value is 1, and so on, with the
last branch address being the goto address for all remaining values in the condition field.
In this case, only one branch address is given, so there is an effective unconditional
transfer to microinstruction 0. Transferring to microinstruction 0 halts the given
machine. When all machines have halted, the simulation caused by the $RUN card terminates.
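
The three sequencing cases in these examples can be summarized in a short Python sketch. It treats a relational test (such as F.SWT = 1) and a computed goto as separate cases, which is an assumption on our part: the report does not say how the simulator tells them apart. The function name `next_microinstruction` and its argument conventions are likewise hypothetical.

```python
def next_microinstruction(label, condition=None, branches=()):
    """Return the label of the next microinstruction (0 means halt).

    condition is None (no test), a bool (outcome of a relational test),
    or a non-negative int (value of the condition field in a computed goto).
    """
    if condition is None or not branches:
        return label + 1                      # no test: continue in sequence
    if isinstance(condition, bool):           # bool is checked before int on purpose
        return branches[0] if condition else label + 1
    # Computed goto: value 0 selects the first address, 1 the second, and the
    # last address catches every remaining value.
    index = min(condition, len(branches) - 1)
    return branches[index]


if __name__ == "__main__":
    print(next_microinstruction(2))                    # 2: (5,3,4)
    print(next_microinstruction(11, True, (11,)))      # 11: ( )(F.SWT = 1)(11)
    print(next_microinstruction(17, 1, (0,)))          # 17: (3)(1)(0)
```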

The $INITIAL and $PROGRAM sections enable one to initialize registers and
memories. The following examples demonstrate the formats for these initializations.

$INITIAL

C = 0 /INITIALIZE COST TO ZERO

The above causes the COST register to be initialized to 0. Unless otherwise stated,
all constants are assumed to be binary.

M.1 = 1 /MICRO-INSTRUCTION COUNTER OF MC. 1 SET TO 1

This causes the microinstruction counter of machine 1 to be initialized to 1. The
microinstruction counters of all machines are automatically built in and available for
use in a microoperation or test condition. Because of this, it is possible to implement
microsubroutines and complex interrupts in MDSL.

$ PROGRAM 

supplied  by 

MDSL  *TREE(1) 

1 00000001  /NODE  NO.  1 

25  00000000  /NO  CHILDREN;  TERMINAL  NODE 

The above will cause the memory TREE to be loaded with 00000001, ..., 00000000,
starting with location 1 and going to location 25.

The general format for a dump specification in the $DUMP section is as follows:

name (lower address: upper address)

For  example, 

PATH(0:  99)  /DUMP  OUT  LOCATION  0 to  99 

This  will  cause  the  contents  of  memory  locations  0 through  99,  inclusive,  of  memory 
PATH  to  be  printed  out.  Unless  OCTAL  is  specified  on  the  $DUMP  card,  the  output 
will  be  in  binary. 

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D.  Illustrative  Example 

An  example  which  demonstrates  some  of  the  most  significant  features  of  MDSL  is 
one  in  which  an  array  of  identical  processors,  operating  asynchronously,  examines  the 
set  of  all  paths  through  a loopless  network  in  order  to  find  the  minimum  cost  path  through 
the network. Figure 3 shows the network to be examined. A cost is associated with
each of the nodes. There is one start node, node 1, and one terminal node, node 7. An
analysis of the network reveals that there are six unique paths from node 1 to node 7.
The minimum cost path is found to be 1-3-4-7, with a cost of 18.

Fig. 3. Network to be followed.
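
The search that the processors carry out can be mimicked sequentially in a few lines of Python. This is only a sketch of exhaustive path enumeration with costs attached to nodes; the successor lists and node costs below are invented for illustration and are not those of Fig. 3, and `all_paths` and `minimum_cost_path` are hypothetical names.

```python
def all_paths(successors, start, goal):
    """Yield every path from start to goal in a loopless (acyclic) network."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            yield path
            continue
        for nxt in successors[node]:
            stack.append(path + [nxt])


def minimum_cost_path(successors, cost, start, goal):
    """Return (path, total cost) with costs summed over the nodes visited."""
    best = min(all_paths(successors, start, goal),
               key=lambda path: sum(cost[node] for node in path))
    return best, sum(cost[node] for node in best)


# Hypothetical six-node network; NOT the network of Fig. 3.
SUCCESSORS = {1: [2, 3], 2: [4, 5], 3: [4], 4: [6], 5: [6], 6: []}
COST = {1: 2, 2: 5, 3: 4, 4: 3, 5: 1, 6: 2}

if __name__ == "__main__":
    print(minimum_cost_path(SUCCESSORS, COST, 1, 6))
```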

Figure  4 provides  a diagram  of  the  configuration  of  the  machines  and  a detailed 
layout  of  a representative  machine.  (The  MDSL  simulation  program  is  given  at  the  end 
of this report.) The circled numbers associated with the various control lines in Fig. 4
correspond to the microoperation labels. For example, line 20 sets the F register of
a machine to 1. This corresponds to the 20: F.SWT>SET>F.SWT entry in the $STATIC
section of the MDSL program.

The multiprocessor example, which required two simulations, demonstrated the
capability within MDSL of modifying and rerunning complex machines. The first
simulation began with the description of the internal structure of one of the machines
and the network to be followed. The definition allowed up to eight machines to be used.
When the simulation was completed, it was found that only six machines were needed. Next,
the $DEF section was used to add one register, ZLIM, to the basic structure defined
before; ZLIM would be used to contain the limit on the number of machines allowed to be
used. In this way, the machine limit was made a variable. The use of ZLIM to limit the
number of machines was structured into the definition of the network by using the $MICRO
section to modify previously defined microinstruction six to read

6: (8)(SWT = ZLIM)(8) /CHECK IF NO. OF MACHINES USED IS = LIMIT

After the reinitialization step, the modified network was simulated again. Since ZLIM
was initialized to three, only three machines were allowed to follow the network. The
results show that the same six unique paths were examined and the same minimum cost path
was found. It should be noted that if ZLIM had been initialized to two, an unresolvable
deadlock would have developed in the network being simulated.


Fig. 4. Processor architecture.
E. Summary

The basic structure and syntax of the MDSL computer hardware design and simulation
language have been described. The language is easy to learn and is based on the
classic Wilkes model of a microprogrammable system. MDSL is currently being used
in both an educational and a research environment. With MDSL a researcher can
describe and simulate a variety of computers and computer-like structures, "debug" his
design, and investigate the tradeoffs involved in various design alternatives. Two
areas that are currently being explored involve the design and use of highly and loosely
parallel structures, such as content addressable parallel processors and networks of
processors, and the investigation of microprogramming structures.

MDSL is currently a developing language; new language constructs are still being
added. Because of its modular form, it is relatively easy to add new features to the
language in such a way as to allow for upward compatibility.
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MDSL  Program  for  Example 
CENTRALIZED TELEPROCESSING NETWORK DESIGN: A NEW HEURISTIC
R.R.  Boorstyn  and  A.  Kershenbaum 

The  problem  considered  is  that  of  finding  an  optimal  (minimum  cost)  design  for  a 
centralized  telecommunication  network  given  a set  of  locations,  traffic  magnitudes  be- 
tween these  locations,  and  a single  common  source  or  destination. 

Several heuristics, which are efficient (in terms of their execution time and memory
requirements on a digital computer) and which produce seemingly good results, have
already been developed and are currently accepted techniques. Some work has also been
done on finding optimal solutions [1, 2, 3] to this problem as a means of verifying the
effectiveness of proposed heuristics. Kershenbaum and Boorstyn [7] have developed a
technique which is capable of handling problems of realistic size by treating
the problem within the context of matroid theory. On the basis of experiments run using
this algorithm for the exact solution, an improved heuristic was developed which is
shown in this report to yield significant improvements (over known techniques) in the
quality of obtainable solutions while still requiring only a modest amount of computer
time.

More  formally,  the  problem  is  one  of  finding  a minimum  spanning  tree  subject  to 
one  or  more  constraints  which  in  general  are  equivalent  to  demanding  that  the  sum  of 
the  traffic  associated  with  the  nodes  in  any  subtree  must  not  exceed  some  predetermined 
maximum. 

A minimum  spanning  tree  is  a loop-free  collection  of  arcs  joining  a set  of  nodes 
such  that  the  sum  of  the  lengths  of  the  arcs  is  minimal.  In  the  case  of  a communication 
network,  these  collections  of  arcs  are  called  multidrop  lines. 

This  constraint  form  is  quite  general  and  encompasses  many  real-world  constraints 
which  arise  in  the  design  of  centralized  telecommunication  networks.  Thus  for  example, 
in  addition  to  treating  the  obvious  constraint  imposed  by  line  capacity,  it  is  possible  to 
treat  a restriction  on  the  number  of  terminals  on  a multidrop  line  by  associating  a uni- 
form traffic  with  each  terminal. 

Also,  the  length  (cost)  functions  which  can  be  treated  are  quite  general.  Any  func- 
tion which  is  not  a function  of  the  tree  chosen  is  permissible. 

A.  Problem  Formulation 

Given:

(1) A vertex (node) set $V = \{v_i \mid i = 0, 1, \ldots, n\}$ representing the terminal
locations in the network, where node $v_0$ is a distinguished node which we will refer
to as the center,

(2) A symmetric distance function $D = \{d_{ij} \mid i, j = 0, 1, \ldots, n\}$ representing
the cost (length) of an arc, and

(3) A constraint, $m$, on the number of nodes which may share a multidrop line.

Minimize

$$\sum_{i=1}^{n} d_{i p_i}$$

subject to (3), where $v_{p_i}$ is the immediate predecessor of $v_i$ in $T$, i.e., the node
closest to $v_i$ on the path between $v_i$ and $v_0$ in $T$, and $T$ is any spanning tree.

We define a multidrop line to be a subtree, $T_j$, rooted at the center, $v_0$, and define
the set of nodes in $T_j$ to be $V_j$. Thus, the constraint can be stated as

$$|V_j| \le m$$

for every $V_j$, for the uniform traffic case, and

$$\sum_{v_k \in V_j} TR_k \le m$$

for the general case, with $TR_k$ the traffic associated with $v_k$.

We  will  refer  to  this  problem  as  the  capacitated  minimum  spanning  tree  (CMST) 
problem. 
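
As a small worked example of this formulation, the Python sketch below evaluates the objective and checks the constraint for a tree stored as a predecessor map. It assumes that the center itself does not count toward the limit m, which the text does not state explicitly; the names `tree_cost`, `line_of`, and `is_feasible` and the five-node instance are hypothetical.

```python
def tree_cost(pred, dist):
    """Objective: sum of d[i][p(i)] over all non-center nodes (node 0 is the center)."""
    return sum(dist[i][p] for i, p in pred.items())


def line_of(i, pred):
    """Return the node of i's multidrop line that is adjacent to the center."""
    while pred[i] != 0:
        i = pred[i]
    return i


def is_feasible(pred, traffic, m):
    """Check that the total traffic on every multidrop line is at most m."""
    load = {}
    for i in pred:
        root = line_of(i, pred)
        load[root] = load.get(root, 0) + traffic[i]
    return all(total <= m for total in load.values())


# Hypothetical instance: center 0 and four terminals, two per multidrop line.
DIST = [
    [0, 2, 4, 3, 5],
    [2, 0, 1, 4, 6],
    [4, 1, 0, 5, 7],
    [3, 4, 5, 0, 2],
    [5, 6, 7, 2, 0],
]
PRED = {1: 0, 2: 1, 3: 0, 4: 3}          # predecessor of each non-center node
TRAFFIC = {1: 1, 2: 1, 3: 1, 4: 1}       # uniform traffic counts nodes per line

if __name__ == "__main__":
    print(tree_cost(PRED, DIST), is_feasible(PRED, TRAFFIC, 2))
```

With uniform traffic of one unit per node, the general constraint reduces to the node-count constraint, which is how the sketch covers both cases.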


B.  Heuristic  Solutions 

The  most  often  used  heuristic  solutions  to  the  CMST  problem  share  the  following 
properties:  (1)  their  running  time  is  a polynomial  (usually  of  order  between  2 and  3)  in 
the  number  of  nodes,  N,  and  is  independent  of  the  constraint;  (2)  in  the  absence  of  con- 
straints, they  will  yield  a minimum  spanning  tree  (MST);  and  (3)  the  quality  of  the 
solution  (i.e.,  the  amount  by  which  it  differs  from  the  optimum)  is  not  controllable  and, 
except  very  loosely,  is  not  known. 

These heuristics can be placed in one of two classes: primal solutions, where one
starts with a feasible solution to the problem and then enlarges or improves it; and dual
solutions, where one starts with an (infeasible) MST and then generates modifications
to make the solution feasible.

Primal algorithms are the more often used. Kershenbaum and Chou have pointed
out that all such algorithms fall into the class of MST problems constrained by traffic
or response time requirements. The difference between them is mainly the sequential
order in which a branch or a line is selected into the tree. Without the constraints,
all the algorithms converge to an MST. With the constraints, they form different
subtrees.

We describe below the Esau-Williams algorithm [4], which is used in the new heuristic
to generate solutions, and Karnaugh's Second Order Greedy Algorithm (SOGA), which
provided the motivation for the branch exchanges used in the new heuristic.
In the Esau-Williams algorithm, initially each node is placed in a separate
component, $C_i$. A tradeoff function, $t_{ij}$, is initialized for each arc, $a_{ij}$:
$t_{ij} = d_{ij} - d_{i0}$, where $d_{ij}$ is the distance or, more generally, the cost of
an arc from $v_i$ to $v_j$.

The arc, $a_{ij}$, for which $t_{ij}$ is minimum is then sought and tested. If $v_i$ and $v_j$
are both in the same component or if the two components, when combined, violate the
problem constraints, the arc is discarded and $t_{ij}$ is set to a very large number.
Otherwise, $a_{ij}$ is added to the solution and the components containing $v_i$ and $v_j$ are
merged. Values of $t_{ij}$ are then recalculated for the nodes $v_i$ of the newly formed
component by setting $t_{ij} = d_{ij} - d_{k0}$, where $d_{k0}$ is the minimum, over all nodes
$v_l$ in the newly formed component, of $d_{l0}$. The procedure of seeking and testing arcs
with minimum $t_{ij}$ is repeated until a complete solution is found.

This procedure gives preference to nodes which are far from the center. Such
nodes are difficult to treat in the sense that if one waits too long, it may be necessary
to connect them directly to the center using a long arc.
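
The procedure just described can be sketched in Python as follows. This is a simple cubic-time illustration rather than an efficient implementation, and it makes two assumptions the text leaves open: the search stops once no arc has a negative tradeoff value, and each remaining component is then joined to the center through its closest node. The function name `esau_williams` and the sample distance matrix are hypothetical, and the constraint is taken in its uniform-traffic form (at most m nodes per line).

```python
def esau_williams(dist, m):
    """Return the arcs (i, j) of a capacitated tree; node 0 is the center."""
    n = len(dist)
    comp = list(range(n))                          # comp[i]: label of i's component
    members = {i: [i] for i in range(1, n)}        # nodes in each component
    gate = {i: dist[i][0] for i in range(1, n)}    # d_k0: shortest link to the center
    discarded = set()                              # arcs whose t_ij is "very large"
    arcs = []
    while True:
        best = None
        for i in range(1, n):
            for j in range(1, n):
                if comp[i] == comp[j] or (i, j) in discarded:
                    continue
                t = dist[i][j] - gate[comp[i]]     # tradeoff t_ij = d_ij - d_k0
                if best is None or t < best[0]:
                    best = (t, i, j)
        if best is None or best[0] >= 0:           # assumed stopping rule: no saving left
            break
        _, i, j = best
        ci, cj = comp[i], comp[j]
        if len(members[ci]) + len(members[cj]) > m:
            discarded.add((i, j))                  # merging would violate the constraint
            discarded.add((j, i))
            continue
        arcs.append((i, j))
        for k in members[ci]:                      # merge component ci into cj
            comp[k] = cj
        members[cj].extend(members.pop(ci))
        gate[cj] = min(gate[cj], gate.pop(ci))
    for nodes in members.values():                 # join each component to the center
        arcs.append((min(nodes, key=lambda k: dist[k][0]), 0))
    return arcs


# Hypothetical symmetric distances: center 0 plus two natural pairs (1,2) and (3,4).
DIST = [
    [0, 5, 6, 5, 6],
    [5, 0, 2, 8, 9],
    [6, 2, 0, 9, 9],
    [5, 8, 9, 0, 3],
    [6, 9, 9, 3, 0],
]

if __name__ == "__main__":
    print(esau_williams(DIST, 2))
```

Because components only grow, an arc rejected for violating the constraint can never become feasible later, which is why discarding it permanently is safe in this sketch.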


In Karnaugh's proposed second order greedy algorithm, one iterates any primal
CMST algorithm, successively forcing selected arcs in or out of the solution, thus
obtaining a sequence of solutions of improving quality. The increase in running time is a
function of how many combinations of branches are forced in and out of the solution. If
one considers (potentially) all combinations, a branch and bound procedure results.

Karnaugh said little about how one selects the arcs to force in or out. The new
heuristic presented here is essentially an outgrowth of this technique where a small
subset of likely candidates for forcing is identified in the course of the algorithm. It
thus becomes possible to obtain significant improvements in the quality of the solution
without unreasonably increasing the algorithm's complexity beyond the point of practical
applicability.


C.  A New  Heuristic 

Recently, an improved technique for finding optimal solutions to CMST problems
was developed. By using the technique it was possible to obtain optimal solutions for
a class of interesting, tightly constrained problems and make observations about
characteristics of optimal solutions.

For a CMST problem with 18 nodes and m = 3, Fig. 1(a) is the solution yielded by
the Esau-Williams algorithm. Figure 1(b) is the unconstrained MST on the same
problem. Figure 1(c) is the optimal solution. Note that the two arcs present in the
optimal solution, but not present in the Esau-Williams solution, are (G,O) and (B,K) and
are MST arcs. While it is not necessarily so that all arcs present in the optimal
solution and not in the Esau-Williams solution are MST arcs, in all the experiments run
and checked to this end, invariably at least one arc in the optimum but not in the
Esau-Williams solution was an MST arc.


Fig. 1. Comparison between the Esau-Williams and optimal solutions.

This is not to say that it is conjectured that one must always be able to find an
optimum solution which contains an MST arc not part of the Esau-Williams solution
(when the Esau-Williams solution itself is not optimal). It merely strengthens the point
that MST arcs are reasonable arcs to examine when seeking to improve a heuristic
solution.

Even without this evidence, however, it is intuitively appealing to consider such
arcs. Most primal heuristics share the property that they proceed by connecting nodes
to their nearest feasible neighbors. The algorithms differ only in the order in which
they consider the nodes. Thus, the solutions differ only in that the distance to a node's
nearest feasible neighbor may differ depending upon when the node is considered during
the execution of the algorithm. If such a heuristic fails to find the optimum, it will be
because the algorithm erred by waiting too long before consideration of some critical
node (or nodes) and hence missed being able to include some desirable arc (or arcs) in
the solution. It is quite likely that one or more of these arcs will be MST arcs; indeed,
as mentioned above, this was always the case with the experiments carried out.

Thus,  a heuristic  is  proposed  which  attempts  to  improve  solutions  generated  by 
some  primal  heuristic  by  forcing  the  inclusion  of  one  or  more  MST  arcs  which  the  pri- 
mal heuristic  left  out.  This  gives  rise  to  the  following  procedure: 


Step 1: Generate a MST, $T_1 = \{a_j \mid j = 1, \ldots, n-1\}$. Generate a feasible tree,
$T_2 = \{b_j \mid j = 1, \ldots, n-1\}$. Find $S = T_1 - T_2 \; (= T_1 - T_1 \cap T_2)$.
(For problems of interest, $S$ is nonempty.)

Step 2: For each subset $S_1 \subseteq S$: set $S_2 = S - S_1$. Remove all $a_j \in S_2$ from
the network. Start a solution by including all $a_j \in S_1$. Apply the primal heuristic to
complete the generation of a solution.

It should be noted that an alternative procedure would be to include $a_j$ in $S_1$ but not
to exclude $a_j$ in $S_2$. It was felt, however, that if elements in $S_2$ were retained as
candidates for inclusion, duplicate solutions would be likely to result. The exclusion of
elements in $S_2$ guarantees unique solutions and hence should increase the likelihood of
generating improvements.
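
The bookkeeping in Steps 1 and 2 can be sketched as a Python generator that, given the arcs of an MST and of a feasible tree, produces every (forced, removed) pair. Rerunning the primal heuristic for each pair is left out; the function name `forcing_plans` and the sample arc lists are hypothetical.

```python
from itertools import combinations


def forcing_plans(mst_arcs, feasible_arcs):
    """Yield (S1, S2) for every subset S1 of S = T1 - T2.

    S1 holds the MST arcs to force into the solution; S2 = S - S1 holds the
    MST arcs to remove from the network before the primal heuristic is rerun.
    """
    t1 = {tuple(sorted(arc)) for arc in mst_arcs}        # treat arcs as undirected
    t2 = {tuple(sorted(arc)) for arc in feasible_arcs}
    s = sorted(t1 - t2)
    for size in range(len(s) + 1):
        for chosen in combinations(s, size):
            s1 = set(chosen)
            yield s1, set(s) - s1


# Hypothetical trees on nodes 0..4 (0 is the center).
MST = [(0, 1), (1, 2), (2, 3), (3, 4)]
FEASIBLE = [(0, 1), (0, 2), (0, 3), (4, 3)]

if __name__ == "__main__":
    for forced, removed in forcing_plans(MST, FEASIBLE):
        print(sorted(forced), sorted(removed))
```

Since each arc of S lands in exactly one of the two sets, no two plans describe the same pair of forced and removed arcs, which matches the uniqueness argument in the text.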

A version of this procedure was implemented using the Esau-Williams algorithm
as the primal heuristic. The Esau-Williams algorithm was modified to accept forced
and removed arcs. This is a straightforward affair. Removed arcs have their lengths
set to a very large number; forced arcs are brought into the solution at the start as if
the Esau-Williams procedure had selected them itself. Note that it is possible for the
forced arcs to imply infeasibility; this condition was tested for explicitly.

Problems  were  generated  by  reading  in  n and  m and  generating  random  x and  y 
coordinates  for  the  nodes  within  a unit  square.  The  location  of  the  center  was  random, 
centered,  or  in  the  corner,  dependent  upon  the  value  of  an  input  parameter.  Euclidean 
distances  were  used. 
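
A problem generator of the kind described might look like the following Python sketch. The parameter values `"random"`, `"centered"`, and `"corner"`, the choice of (0, 0) as the corner, and the function name `generate_problem` are assumptions; the report does not give these details.

```python
import math
import random


def generate_problem(n, center="random", seed=None):
    """Return (nodes, dist): node 0 is the center, nodes 1..n are terminals."""
    rng = random.Random(seed)
    terminals = [(rng.random(), rng.random()) for _ in range(n)]   # unit square
    if center == "centered":
        root = (0.5, 0.5)
    elif center == "corner":
        root = (0.0, 0.0)                  # assumed corner; any corner would do
    elif center == "random":
        root = (rng.random(), rng.random())
    else:
        raise ValueError("center must be 'random', 'centered', or 'corner'")
    nodes = [root] + terminals
    dist = [[math.dist(p, q) for q in nodes] for p in nodes]       # Euclidean
    return nodes, dist


if __name__ == "__main__":
    nodes, dist = generate_problem(5, center="centered", seed=1)
    print(len(nodes), round(dist[0][1], 3))
```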

A variety  of  problems  were  run  with  two  goals: 

(1) To see how the new heuristic compared with the optimum; and

(2) To see how the new heuristic compared with known heuristics.

Problems in the first class were relatively small (n ≤ 30) and somewhat loosely
constrained. In order to conserve computer time the problems were chosen from among
those where the known heuristics performed most poorly. The new heuristic hit the
optimum fairly consistently. In the thirty trials that were run, in only two cases did
the new heuristic miss the optimal solution; in both cases, the new heuristic came
within 1% of the optimum.

To  see  how  the  new  heuristic  compared  with  known  heuristics,  a broad  range  of 
problems  for  n ranging  as  high  as  100  and  m varying  from  n/3  to  n/ 10  were  examined. 

Most primal heuristics generate subtrees which are MST's on the set of nodes
they contain along with the center. Most of the arcs in these subtrees (with the exception
of the arc directly connected to the center) will be MST arcs. Thus, one might expect
that, as $|S| = n/m$, a vast improvement can be made over a blind branch exchange or
SOGA procedure. However, this turns out to be slightly conservative. In practice, for
$n < 20$ and $v_0$ in the geographic center, and for $n < 15$ and $v_0$ in the corner,
$|S| = n/m$. For larger networks, $|S|$ ranges between [illegible] and [illegible], where
$|S|$ for networks with noncentered roots is generally larger than for the same network
with $v_0$ in the center.

There are, in fact, $2^{|S|}$ subsets of $S$. Thus, for moderately large, tightly
constrained problems, $S$ could grow large enough to make evaluating all subsets
impractical. Furthermore, it is reasonable to assume that not all arcs in $S$ interact with
one another; i.e., improvements in separate parts of the network can be found and justified
independently of one another. Further, in running the experiments, it was found that if a
group of arcs yielded an improvement when forced into the solution, subsets of these arcs
often yielded improvements too. Thus, the heuristic was modified by only considering
subsets $S_1$ such that $|S_1| \le K$ for some given $K$. The best subset, $S^*$, is found and
permanently forced into the solution. We then set $S = S - S^*$ and repeat the procedure
until no further improvement can be made. As multiple branch exchanges are often
required to effect improvements, the value of $K$ chosen should be large enough to ensure
that most advantageous exchanges will be found and at the same time small enough to
keep the procedure computationally tractable; $K = 2$ seemed to work well. Experiments
were run with larger values of $K$; only in isolated cases was any improvement over $K = 2$
obtained; in no case was an improvement in excess of 1% observed. Thus, it was
decided in all the remaining experiments to use the extended procedure (which proceeds
from the best solution found) and restrict the examination to subsets of cardinality less
than or equal to two.
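
To give a feel for how much work the restriction to small subsets saves, the short Python sketch below counts the nonempty subsets of S with at most K arcs and compares the count with the number of all subsets. The function name `subsets_examined` and the sample size of 20 arcs are chosen only for illustration.

```python
from math import comb


def subsets_examined(size_of_s, k):
    """Number of nonempty subsets of S that contain at most k arcs."""
    return sum(comb(size_of_s, r) for r in range(1, k + 1))


if __name__ == "__main__":
    size = 20                                  # hypothetical |S|
    print(subsets_examined(size, 2))           # 20 single arcs + 190 pairs
    print(2 ** size)                           # all subsets of S
```

For 20 candidate arcs, one pass with K = 2 examines 210 subsets instead of more than a million, at the price of having to iterate after each forced subset.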

Using this modification of the new heuristic on a large number of problems, it
was observed that in general, forcing arcs between nodes which are close to the center
seemed to have the greatest effect on the value of the solution. This was probably a
consequence of the fact that the Esau-Williams algorithm starts with nodes far from the
center and hence, dealing with nodes near the center first radically changes the solution
value.

Large  subsets,  in  some  cases  all  of  S,  sometimes  yielded  the  best  solutions. 

This  means  that  one  cannot  simply  consider  subsets  of  limited  size  without  iterating  to 
subsequently  force  additional  arcs  into  the  solution. 

In particular, it was found that for small networks (n < 20) with centered $v_0$, there
was little difference (usually about 1% or less) in performance between the new heuristic
and the Esau-Williams solution. This is almost certainly because both procedures were
generating near-optimal solutions. However, for larger networks (up to n = 100) and
centered $v_0$, particularly for tightly constrained problems, the new heuristic performed
Fig. 2. Improvements obtained with the new heuristic.

improvement obtained using the new heuristic relative to the Esau-Williams algorithm.
Each point on the curve represents an average over 12 sample runs. As can be seen,
the improvement increases as the problem size does. Significant savings are obtainable
for realistic problems.

Surprisingly, the improvement was only weakly correlated with problem tightness
(ratio of m to n); the new heuristic achieved slightly larger savings in more tightly
constrained problems. Relatedly, substantial savings were achieved even for
very loosely constrained problems. In such cases, the constrained solutions differed
only slightly from the MST, as expected. What was startling was that the values obtained
with the Esau-Williams algorithm sometimes differed from the MST value by over 20%
more than those obtained using the new heuristic; i.e.,

$$ S = \frac{V_{ew} - V_{mst}}{V_{nh} - V_{mst}} $$

was often over 1.2, where

$V_{ew}$ is the Esau-Williams value,
$V_{nh}$ is the new heuristic value, and
$V_{mst}$ is the MST value.

Figure 3 is a plot of S vs. N, the number of nodes. Again, it is an increasing function
of N.
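The ratio S can be sketched in a few lines. The function name and the three cost values below are hypothetical, chosen only to show a case in which the Esau-Williams excess over the MST is 20% larger than that of the new heuristic.

```python
def adjusted_improvement(v_ew: float, v_nh: float, v_mst: float) -> float:
    """Return S = (V_ew - V_mst) / (V_nh - V_mst)."""
    excess_nh = v_nh - v_mst
    if excess_nh <= 0:
        raise ValueError("new-heuristic value must exceed the MST value")
    return (v_ew - v_mst) / excess_nh


# Hypothetical costs: MST = 100, Esau-Williams = 112, new heuristic = 110.
s = adjusted_improvement(112.0, 110.0, 100.0)
```

Because S compares excess cost over the MST rather than total cost, it stays informative for loosely constrained problems in which all three values are close together.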


Fig. 3. Adjusted improvements using the new heuristic (vertical axis in percent;
horizontal axis: number of nodes, 20 to 100).

Results were even more encouraging for large networks with $v_0$ in the corner;
such networks may be viewed as one-fourth of a network of 4n nodes with $v_0$ in the
center. For such problems, improvements averaging 4% and as high as 8% were observed.
Thus, the procedure appears to be of value for many realistically sized problems
with tight constraints.

A straightforward implementation of the Esau-Williams procedure has a computational
complexity of order $n^2 \log_2 n$. A more careful implementation can reduce the
complexity to order $n \log_2 n$. As discussed previously, |S| was found to range between

and  . Since  we  examine  subsets  of  cardinality  at  most  ( ^ ) subsets  are 

examined  on  each  iteration.  At  worst,  ^ successive  subsets  of  cardinality  could  be 

2 I si 

introduced  and  thus,  the  procedure  could  iterate  at  most  times.  In  practice,  the 
number  of  iterations  grows  more  slowly  than  | S|  . 
Thus, a careful implementation of the procedure has a complexity of |S| log n.
If, for instance, |S| = n^^^, the procedure's complexity is log n.
COMMUNICATION NETWORKS WITH ADAPTIVE CHANNEL CAPACITIES
K.  Sohraby,  L.  Shaw  and  R.  R.  Boorstyn 

We  consider  a network  of  nodes  interconnected  by  communication  channels.  It 
is  assumed  that  the  route  for  each  message  from  its  origin  node  to  its  destination 
node  has  been  chosen,  and  that  at  each  node  along  this  route  the  message  waits  in  a 
buffer queue until the channel needed for its next step is free. The i-th buffer has a
maximum capacity of $K_i$ messages so that messages arriving at a filled buffer will
be blocked or lost, requiring retransmission at a later time.

The novel idea considered here is that the network's efficiency, measured by
average message delay and channel costs, can be improved by allowing frequent changes
in channel capacity. An expensive channel with higher capacity might be employed
when the buffer is nearly full, but a slower, less expensive channel might suffice
when few messages await transmission. Such rapid changes, say after time intervals
in which a small number of messages might arrive or depart, would be feasible in
some channels currently being considered by communication engineers. We refer
here to a channel which is shared by voice and data users, with higher priority for
the data. Adaptation of the capacity available for the data could be achieved at the
expense of occasional reduction in the capacity available for the less important voice
users.

Although  one  ultimate  goal  is  the  control  of  channel  capacities  for  a network  of 
interconnected  queues,  the  first  step  will  be  to  optimize  the  queue-dependent  selection 
of  channel  capacities  for  a single  queue. 

The  second  step  is  the  derivation  of  an  effective  short  term  departure  rate  for 
the  approximate  Poisson  departure  process  from  such  a controlled  queue.  This  de- 
parture model  will  then  be  used  to  describe  the  arrival  process  at  the  next  queue  in 
the  network. 

This report summarizes results achieved for these first two stages of the study,
and then suggests the manner in which these results will be used for the overall
network optimization.

A. Controlled Queue


A single buffer queue is modeled as an M/M/1/K queue, i.e., with Poisson
arrivals (rate $\lambda$), exponential service times, a single server, and a maximum storage
of K messages. We use $x_k$ to represent the state (number of messages waiting or
in service) at time $t_k$. The capacity $\mu_k$ used during the interval $(t_k, t_{k+1})$ will
be a function of the state $x_k$ at the beginning of the interval. Although the service
rate may change sequentially, the queue discipline is in all other respects just like
an ordinary first-come, first-served one. This is in contrast to an alternate model
in which service in each interval is restricted to those messages already in the buffer
at the beginning of the interval.

Three types of decision-time sequences are considered. The fixed interval control
(FIC) procedure has equally spaced decision times $t_k = kT$. The random interval
control (RIC) procedure assumes independent, identically distributed (i.i.d.)
inter-decision intervals, which might represent a situation where the decision computer is
time-shared with other functions. Finally, a continuous control (CC) procedure, which
considers capacity changes after every arrival or departure, is analyzed for comparison
(lower bound) with the other two more feasible cases.

Defining the sequence of decision intervals $T_k = t_{k+1} - t_k$, we have

FIC: $T_k = T$ for all k

RIC: $T_k$ i.i.d. and independent of $\{x_j\}$, with density
     $f_{T_k}(t) = \gamma e^{-\gamma t}$, $t > 0$

CC:  $T_k$ i.i.d. and independent of $\{x_j\}$, with density
     $f_{T_k}(t) = \nu e^{-\nu t}$, $t > 0$,
     where $\nu > \lambda + \mu_{\max}$ and $\mu_{\max}$ is the largest available service rate.

In the CC case we have used the scheme of Lippman (Ref. 3) to convert the original problem
with state-dependent inter-decision intervals to one with state-independent intervals.
In the converted model the mean interval durations are shorter, and there is a nonzero
probability that the state does not change at each of these newly defined random
times.

A cost criterion is essential if operational efficiency is to be optimized. The
mean cost corresponding to operation over the interval $T_k$, starting in state $x_k = i$,
is taken as

$$ C_i(k) = (\text{mean message delay during } T_k) + \alpha\,(\text{mean channel cost during } T_k) + \beta\,(\text{expected number of blocked messages during } T_k) \tag{1} $$

where $\alpha$ and $\beta$ are weighting factors. Expressions for each of the terms in Eq. (1)
will be defined separately for each of the three cases being studied. The ultimate
desire is to minimize the steady state mean loss rate $\bar{C}$, of the form

$$ \bar{C} = \lim_{k \to \infty} \frac{1}{k}\, E\Big\{ \sum_{r=0}^{k-1} C_{x_r}(r) \,\Big|\, x_0 = i \Big\} \tag{2} $$

which will turn out to be independent of the initial state i.

Given a finite set of possible service rates $E = \{e_1, e_2, \ldots, e_M\}$, a stationary
decision policy will be a function which assigns an $e_j$ to $\mu_i = \mu(x_k = i)$ for each
i = 0, 1, ..., K.

Since there is a finite number ($M^{K+1}$) of different policies, it is clear that an
optimal one must exist to minimize $\bar{C}$. Howard's Dynamic Programming procedure
is an efficient way to find the best policy.

The analysis here is simplified by the fact that a stationary policy is assumed.
Thus, the mean one-step cost $C_i(k)$ in Eq. (1) is not a function of k; the sequence
$\{x_k\}$ is an ergodic Markov chain; and the loss rate in Eq. (2) is independent of the
initial state $x_0 = i$. That limiting mean loss per step can also be written as

$$ \bar{C} = \sum_{j=0}^{K} \pi_j C_j \tag{3} $$

with the aid of the steady state probabilities $\pi_j = P[x_k = j]$ defined by the conditions

$$ \pi_j = \sum_{i=0}^{K} \pi_i P_{ij} \tag{4} $$

together with the normalization $\sum_j \pi_j = 1$.

The stationary one-step transition probabilities $P_{ij}$ will later be defined differently
for each of the three cases.

Howard's procedure examines the mean total n-step loss

$$ \ell_i(n) = E\Big[ \sum_{k=0}^{n-1} C_{x_k}(k) \,\Big|\, x_0 = i \Big] \tag{5} $$

In view of the limiting constant loss rate in Eq. (2), we have, for large n, approximately
linear $\ell_i(n)$:

$$ \ell_i(n) \approx n\bar{C} + m_i \tag{6} $$

with intercepts $m_i$. Substitution of Eq. (6) into Eq. (5) leads to the set of (K + 1)
equations in (K + 2) unknowns, $\bar{C}, m_0, m_1, \ldots, m_K$:

$$ \bar{C} = C_i + \sum_{j=0}^{K} P_{ij} m_j - m_i; \quad i = 0, 1, \ldots, K \tag{7} $$

($C_i$ and $P_{ij}$ will be computable for a given policy, and they are thus known when
solving Eq. (7) for $\bar{C}$.) It is clear that Eq. (7) is unchanged if the same constant is
added to every $m_i$, so we can set $m_0 = 0$ and reduce the number of unknowns to
equal the number of linear equations.

Rather than solving Eq. (7) for $\bar{C}$ for each possible policy, Howard has shown
that the following iterative procedure converges efficiently to the optimal policy.

Policy Improvement

For each i, find the $e_j = \mu_i$ which minimizes the test quantity

$$ C_i(\mu_i) + \sum_{j=0}^{K} P_{ij}(\mu_i)\, m_j $$

in which the $m_j$ from the previous policy are maintained. The resulting $\mu_i$
becomes a part of the improved policy, and the corresponding $C_i$ and $P_{ij}$ are
used in the next value determination.

Value Determination

Given $C_i$ and $P_{ij}$ for a given policy, solve Eq. (7) for $\bar{C}$ and $m_i$ (with $m_0 = 0$).

The iteration is stopped, and the optimal policy achieved, when no policy changes
are found in the Improvement Routine.
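The two alternating steps can be sketched as follows. This is a minimal pure-Python illustration of Howard-style policy iteration, not the authors' program: the helper names (`solve_linear`, `value_determination`, `policy_iteration`) and the two-state toy problem at the end are assumptions introduced here. The value-determination step solves the (K + 1) linear equations of Eq. (7) with $m_0 = 0$.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def value_determination(costs, probs):
    """Solve C_bar + m_i - sum_j P_ij m_j = C_i with m_0 = 0."""
    n = len(costs)
    # Unknown vector is (C_bar, m_1, ..., m_{n-1}).
    a = [[1.0] + [(1.0 if j == i else 0.0) - probs[i][j] for j in range(1, n)]
         for i in range(n)]
    x = solve_linear(a, list(costs))
    return x[0], [0.0] + x[1:]


def policy_iteration(rates, cost_fn, prob_fn, n_states):
    """cost_fn(i, mu) gives C_i; prob_fn(i, mu) gives row i of P under rate mu."""
    policy = [rates[0]] * n_states
    while True:
        costs = [cost_fn(i, policy[i]) for i in range(n_states)]
        probs = [prob_fn(i, policy[i]) for i in range(n_states)]
        c_bar, m = value_determination(costs, probs)
        new_policy = []
        for i in range(n_states):
            def test_quantity(mu):
                row = prob_fn(i, mu)
                return cost_fn(i, mu) + sum(p * mj for p, mj in zip(row, m))
            best = min(rates, key=test_quantity)
            # Keep the current choice on ties so that the iteration terminates.
            if test_quantity(policy[i]) <= test_quantity(best) + 1e-12:
                best = policy[i]
            new_policy.append(best)
        if new_policy == policy:
            return policy, c_bar
        policy = new_policy


# Hypothetical two-state toy problem (not from the report).
def toy_cost(i, mu):
    return i + 0.1 * mu


def toy_row(i, mu):
    if i == 0:
        return [0.5, 0.5]
    return [0.8, 0.2] if mu == 1 else [0.0, 1.0]


policy, c_bar = policy_iteration([0, 1], toy_cost, toy_row, 2)
```

In the toy problem, serving in state 1 costs slightly more per step but returns the chain to the cheap state, so the iteration moves from the all-idle starting policy to the policy that serves only in state 1.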

B. Fixed Interval Control

In this case the instantaneous transition probabilities can be found as the solution
of the birth-death equations for the finite storage queue,

$$ \dot{p}_0(t) = -\lambda p_0(t) + \mu_i p_1(t) $$
$$ \dot{p}_j(t) = -(\lambda + \mu_i) p_j(t) + \lambda p_{j-1}(t) + \mu_i p_{j+1}(t); \quad 0 < j < K \tag{8} $$
$$ \dot{p}_K(t) = -\mu_i p_K(t) + \lambda p_{K-1}(t) $$

where we identify $p_{ij}(\mu_i, t) = p_j(t)$ in Eq. (8) when the initial conditions are $p_i(0) = 1$
and other $p_j(0) = 0$, $j \ne i$. The one-step transition probabilities needed for Eq. (7)
are thus $P_{ij} = p_{ij}(\mu_i, T)$.

In this FIC case, the one step mean cost from Eq. (1) becomes

$$ C_i = \int_0^T \Big[ \sum_{j=0}^{K} j\, p_{ij}(\mu_i, t) + \beta\, p_{iK}(\mu_i, t) \Big] dt + \alpha T\, C(\mu_i) \tag{9} $$

where $C(\mu_i)$ = cost/unit-time of a channel having service rate
$\mu_i$ = (channel capacity)/(mean bits/message).

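C. Random Interval Control

Here the durations $T_k$ are random so

$$ P_{ij} = E\{ p_{ij}(\mu_i, T_k) \} = \int_0^\infty p_{ij}(\mu_i, t)\, \gamma e^{-\gamma t}\, dt $$

using the $p_{ij}(\mu_i, T_k)$ generated by Eq. (8). Applying the expectation integral to both
sides of Eq. (8) produces a set of simultaneous, linear, algebraic equations for the $P_{ij}$.

The $C_i$ expression also simplifies greatly in this case because integration by
parts yields

$$ E\Big\{ \int_0^{T_k} p_{ij}(\mu_i, t)\, dt \Big\} = \frac{1}{\gamma} P_{ij} $$

so that

$$ C_i = \frac{1}{\gamma} \Big[ \sum_{j=0}^{K} j\, P_{ij} + \beta P_{iK} + \alpha\, C(\mu_i) \Big] $$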

D. Continuous Control

Here we postulate Poisson event times with rate $\nu > \lambda + \mu_{\max}$ and transition
probabilities

$$ P_{i(i+1)} = \lambda/\nu; \quad 0 \le i < K $$
$$ P_{i(i-1)} = \mu_i/\nu; \quad 1 \le i \le K $$
$$ P_{ii} = 1 - (\lambda + \mu_i)/\nu; \quad 0 < i < K $$
$$ P_{00} = 1 - \lambda/\nu $$
$$ P_{KK} = 1 - \mu_K/\nu $$
$$ P_{ij} = 0 \text{ otherwise} $$

It can be shown that this process is equivalent to that of a channel-capacity-changing
decision at every arrival and departure time. The stationary policies will clearly
not change capacity at those event times where no state change occurs.

In this case, the expected one-step cost is

$$ C_i = \begin{cases} \dfrac{1}{\nu}\,\big(i + \alpha C(\mu_i)\big), & i < K \\[1ex] \dfrac{1}{\nu}\,\big(i + \beta + \alpha C(\mu_i)\big), & i = K \end{cases} \tag{14} $$
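A short sketch of how the continuous-control transition rows and one-step costs might be tabulated. The function names, the choice $\nu = 11$, and the exact form of the blocking term at i = K are assumptions made for illustration; the other parameter values are those quoted in the numerical examples (K = 4, rates 0 to 9, $\alpha = 0.3$, $\beta = 10$, linear channel cost).

```python
def cc_row(i, mu, lam, nu, K):
    """Row i of the one-step transition matrix for continuous control."""
    row = [0.0] * (K + 1)
    if i < K:
        row[i + 1] = lam / nu        # an arrival is accepted
    if i > 0:
        row[i - 1] = mu / nu         # a service completion
    row[i] = 1.0 - sum(row)          # event that leaves the state unchanged
    return row


def cc_cost(i, mu, nu, K, alpha, beta, channel_cost=lambda m: m):
    """One-step cost; the blocking penalty is charged only in the full state
    (assumed form of the i = K line)."""
    blocked = beta if i == K else 0.0
    return (i + blocked + alpha * channel_cost(mu)) / nu


# Illustrative values: lambda = 1, maximum rate 9, so nu = 11 exceeds lambda + mu_max.
rows = [cc_row(i, 9.0, 1.0, 11.0, 4) for i in range(5)]
costs = [cc_cost(i, 9.0, 11.0, 4, 0.3, 10.0) for i in range(5)]
```

Choosing $\nu$ strictly larger than $\lambda + \mu_{\max}$ keeps every diagonal entry positive, which is what makes the converted chain aperiodic for any policy.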

E.  Numerical  Examples 

Figure 1 compares the average cost per unit time $\bar{C}$, as a function of arrival
rate $\lambda$, for several capacity-switching policies.

Here, the weighting parameters are $\alpha = 0.3$ and $\beta = 10$, the maximum number of
messages which can be stored is K = 4, the available service rates are (0, 1, 2, 3, 4, 5, 6,
7, 8, 9), and the channel cost function is linear: $C(\mu_i) = \mu_i$.

Clearly, the lowest curve must correspond to the continuous control case,
which is the most flexible. Moreover, the optimal policy in that case is of the bang-bang
form: $\mu = (0, 9, 9, 9, 9)$. That is, we should use $\mu_0 = 0$ when $x_k = 0$ and the
maximum $\mu_i = 9$ at all other times.

The middle curve in Fig. 1 corresponds to a fixed-interval controller with T = 1.


Fig. 1. Minimal cost for various capacity-switching disciplines (horizontal axis:
arrival rate in messages/unit time).


COMPUTERS  AND  COMPUTER-COMMUNICATION  NETWORKS 




The most costly curve in that figure results when capacity switching is not permitted,
but the best fixed capacity is used for each arrival rate. Table I shows the optimal
policies for these two cases, as functions of arrival rate.

TABLE I. Optimal service rates.

Arrival rate λ    Optimal μ, fixed-interval    Optimal μ, no switching
0.1               (1, 2, 4, 5, 8)              (1, 1, 1, 1, 1)
1                 (1, 3, 5, 7, 9)              (3, 3, 3, 3, 3)
2                 (2, 4, 7, 7, 9)              (5, 5, 5, 5, 5)
3                 (4, 6, 7, 9, 9)              (6, 6, 6, 6, 6)
4                 (6, 7, 8, 9, 9)              (7, 7, 7, 7, 7)

These results were obtained via dynamic programming as outlined above. The
fixed-interval case also required numerical integration of Eqs. (8) and (9) for various
test policies.

F.  Effective  Departure  Rate 

When analyzing a network of interconnected, variable service rate queues, it
will be necessary to characterize the departure process from each such queue during
an optimization interval of duration T. As a first step, we have defined and evaluated
the mean departure rate $\eta_i$, for the fixed-interval case, as a function of the state i
at the beginning of the interval.

If there were no limit on the size of the "waiting room" (K = ∞) we would define

$\lambda \Delta t$ = P[an arrival in any $\Delta t$ interval]

$d_k$ = number of departures during interval $T_k$

to get

$$ \eta_i = E\{ d_k \mid x_0 = i \}/T \tag{15} $$
$$ \phantom{\eta_i} = E\Big\{ x_0 + \int_0^T \lambda\, dt - x_T \,\Big|\, x_0 = i \Big\}/T $$

or

$$ \eta_i = \lambda - [E\{ x_T \mid x_0 = i \} - i]/T \tag{16} $$

It is well known (Ref. 5) that, for K = ∞,

$$ \mathcal{L}[E\{ x_t \mid x_0 = i \}] = \frac{s[i + \mu P^*_{i0}(s)] + \lambda - \mu}{s^2} \tag{17} $$

where

$$ P^*_{i0}(s) = \mathcal{L}\{ p_{i0}(t) \} . $$

Thus, for the K = ∞ case

$$ \eta_i = \mu \Big[ 1 - \frac{1}{T} \int_0^T p_{i0}(t)\, dt \Big] \tag{18} $$
Curiously, Eq. (18) applies also for the finite-K case, as long as the appropriate
$p_{i0}(t)$ is used (i.e., found from Equations (8)). To see this, we consider the
modification of Eq. (15) to account for the probability

$\lambda \Delta t\,[1 - p_{iK}(t)]$ = P[an arrival in $(t, t + \Delta t)$ | $x_0 = i$]

in place of ($\lambda \Delta t$) for the unbounded case. In this way

$$ \eta_i = \lambda - \Big[ \lambda \int_0^T p_{iK}(t)\, dt + E\{ x_T \mid x_0 = i \} - i \Big]/T \tag{19} $$

It has been found that the expectation in Eq. (19) has a Laplace Transform

$$ \frac{s[i - \lambda P^*_{iK}(s) + \mu P^*_{i0}(s)] + \lambda - \mu}{s^2} $$

It follows that Eq. (18) is valid here also.

The general shape of $\eta_i$ vs. i is monotone increasing toward an asymptote of
$\mu$. If "many" messages are present at the outset ($x_0 \gg \mu T$), then messages will
be departing continuously at the channel capacity rate of $\mu$.
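The behavior described above can be checked numerically. The sketch below integrates the birth-death equations for an M/M/1/K queue with a fourth-order Runge-Kutta step and then evaluates $\eta_i = \mu[1 - (1/T)\int_0^T p_{i0}(t)\,dt]$. The function names, step count, and parameter values ($\lambda = 1$, $\mu = 3$, K = 4, T = 1) are illustrative assumptions, not values from the report.

```python
def birth_death_rhs(p, lam, mu):
    """Time derivative of the state probabilities of an M/M/1/K queue."""
    K = len(p) - 1
    dp = [0.0] * (K + 1)
    for j in range(K + 1):
        out_rate = (lam if j < K else 0.0) + (mu if j > 0 else 0.0)
        dp[j] = -out_rate * p[j]
        if j > 0:
            dp[j] += lam * p[j - 1]
        if j < K:
            dp[j] += mu * p[j + 1]
    return dp


def departure_rate(i, lam, mu, K, T, steps=2000):
    """Mean departure rate over (0, T) when the interval starts in state i."""
    h = T / steps
    p = [1.0 if j == i else 0.0 for j in range(K + 1)]
    integral = 0.0
    for _ in range(steps):
        k1 = birth_death_rhs(p, lam, mu)
        k2 = birth_death_rhs([x + 0.5 * h * d for x, d in zip(p, k1)], lam, mu)
        k3 = birth_death_rhs([x + 0.5 * h * d for x, d in zip(p, k2)], lam, mu)
        k4 = birth_death_rhs([x + h * d for x, d in zip(p, k3)], lam, mu)
        new_p = [x + h * (a + 2 * b + 2 * c + d) / 6
                 for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
        integral += 0.5 * h * (p[0] + new_p[0])   # trapezoidal rule on p_i0(t)
        p = new_p
    return mu * (1.0 - integral / T)


# Illustrative parameters: lambda = 1, mu = 3, K = 4, T = 1.
rates = [departure_rate(i, 1.0, 3.0, 4, 1.0) for i in range(5)]
```

With these values the computed rates rise with the starting state i and remain below $\mu$, in line with the monotone approach to the asymptote described in the text.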

G.  Network  Design 

In a network, the arrival process at one queue will be formed as a combination
of departure processes from other queues. The intention is to optimize the network
policy by sequentially optimizing the individual queues, with updating of the arrival
parameters at each step based on changes in the departure processes from the other
queues.
UPPER BOUNDS ON THE NUMBER OF TESTS NEEDED TO VERIFY A COMPUTER
PROGRAM 

G.S.  Popkin 

The  upper  and  lower  bounds  on  the  minimum  number  of  program  test  cases  were 
discussed  in  Ref.  1,  with  application  to  flowcharts  containing  two-way  decisions.  The 
conditions  for  reaching  the  lower  and  upper  bounds  were  discussed,  and  examples 
given.  In  this  report,  these  ideas  are  extended  to  flowcharts  containing  three-way 
(e.g.,  A<  B,  A = B,  A>  B)  and  multi-way  decisions. 

A.  Upper  and  Lower  Bounds  on  the  Number  of  Tests  Needed  to  Verify  a Program 

Each of the two flowcharts in Fig. 1 contains four three-way decisions. The numbering
of the segments, with two segment numbers on some of the flow lines, indicates
that the decisions are three-way. In Fig. 1(a), the methods of Ref. 2 would yield a
maximum incomparable set size (and hence a lower bound on the minimum number of
tests) of 3. It will be shown below how the upper bound may be computed, and how the
flowchart contents can be inserted to raise the minimum number of tests required to
approach the upper bound.


Fig. 1. (a) A flowchart with the maximum incomparable set of size 3.
        (b) A flowchart with the maximum incomparable set of size 9.

In Fig. 1(b), the methods of Ref. 2 yield 9 as the size of the maximum incomparable
set, and the lower bound on the minimum number of tests. Nine is also the upper
bound, for no flowchart contents can raise the minimum number of required tests above
9.

1. Minimum Number of Tests for Charts with Three-Way Deciders

In a loopless flowchart with three-way decisions, the upper bound on the minimum
number of tests needed to pass through each segment at least once is given by

u = 2d + 1

where d is the number of deciders in the flowchart.

Proof: Consider a flowchart with no deciders. Such a flowchart consists of one
segment and requires one test. Each three-way decider added to the flowchart can
require at most two additional tests, so u = 2d + 1.

2. Minimum Number of Tests for Charts with Multi-Deciders

In a flowchart where the deciders may have any different numbers of outcomes,
the upper bound on the minimum number of tests needed to pass through each segment
at least once is given by

$$ u = \sum_{i=1}^{n} w_i - n + 1 $$

where $w_i$ is the number of outcomes of decider i, and n is the number of deciders in
the flowchart.

Proof: Consider a flowchart with no deciders. It consists of one segment and
requires one test. Each decider i can require at most $w_i - 1$ additional tests, so

$$ u = \sum_{i=1}^{n} (w_i - 1) + 1 = \sum_{i=1}^{n} w_i - n + 1 $$

as asserted.
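The bound is simple enough to evaluate directly. The following sketch (the function name is an assumption) counts one test for the decider-free chart and at most $w_i - 1$ further tests per decider, and evaluates it for four three-way deciders as in Fig. 1.

```python
def upper_bound_tests(outcomes):
    """Upper bound on the minimum number of tests for a loopless flowchart.

    `outcomes` lists the number of outcomes w_i of each decider.
    """
    return sum(w - 1 for w in outcomes) + 1


four_three_way = upper_bound_tests([3, 3, 3, 3])   # four three-way deciders
no_deciders = upper_bound_tests([])                # a single segment
```

For d three-way deciders the general expression reduces to 2d + 1, and for d two-way deciders it reduces to d + 1.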

Figure  2 portrays  the  flowchart  of  Fig.  1(a)  with  contents.  The  minimum  number 
of  tests  needed  to  pass  through  each  segment  at  least  once  is  now  7.  The  computed 
upper  bound  for  the  flowchart  is  9. 

In the deciders in Fig. 2, the segment numbers have the following meanings:

Segment No.    Outcome
1              P = 8
2              P > 8
4              P = 5
5              P > 5
7              P = 2
8              P > 2
10             P = 0
11             P < 0

[Flowchart: START; read P; a sequence of three-way deciders whose flow lines carry
the segment pairs 1,2; 4,5; 7,8; and 10,11, with assignment boxes P ← P - 2; STOP.]

Fig. 2. A flowchart with contents.


If the input variable P may take on the seven values 1, 2, 4, 5, 7, 8, and 9, then the
flowchart of Fig. 2 would require seven tests to traverse each segment at least once.

P    Path Traversed
1    3-6-9-11
2    3-6-7-10
4    3-6-8-12
5    3-4-8-12
7    3-5-8-12
8    1-5-8-12
9    2-5-8-12

SAFETY,  RELIABILITY  AND  SOFTWARE  ENGINEERING 




The  above  illustrates  the  calculation  of  the  upper  bounds  on  the  number  of  tests. 
Work  is  continuing  on  obtaining  the  actual  number  of  tests  rather  than  just  a pessimis- 
tic upper  bound. 
EXTENSIONS OF SOFTWARE PHYSICS TO MEASURES OF COMPLEXITY
E.  Berlinger  and  H.  Ruston 

In his work on software physics, M. Halstead introduced a measure of complexity
based upon certain program parameters. With the parameters:

$N_1$ = number of distinct operators

$N_2$ = number of distinct operands

the measure is

$$ V = N_1 \log_2 N_1 + N_2 \log_2 N_2 $$

which he calls the program volume. Additional definitions and theorems are also given
or conjectured. These relate the effort in programming and the total programming time,
and they estimate the program length.

Mathematically, the formulas are empirical and the proofs heuristic. But the
results agree well with experimental data. The present effort attempts to improve on
the Halstead scheme.


A.  Outline  of  Current  Work 

One of the principal purposes of the work is to refine the definition of program
volume to make it mathematically more sound, and also to include frequency of usage
of the various program constructs. If we define:

$f_i$ = frequency of usage of the i-th operator

$p_i$ = probability of usage of the i-th operator

$f_j$ = frequency of usage of the variable whose rank is j

$p_j$ = probability of usage of the variable whose rank is j

we can then define a measure of complexity as

There  is  strong  justification  for  this  definition  from  an  information  theory  point  of  view. 
This  measure  should  correlate  well  with  the  number  of  bugs  in  a program.  If  so,  then 
the  number  of  bugs  can  be  predicted  from  an  initial  version  of  the  program. 


Work is currently focused on automating the process for gathering the statistics
necessary to obtain the $p_i$ and $p_j$. To this end, the operating system of an IBM 370/125
is being modified to copy all error-free FORTRAN student programs onto a tape. Student
programs collected over a full semester will then yield the necessary probabilities.

Statistics on errors will also be collected and automatically copied onto a second
tape.  Specifically,  the  FORTRAN  error  numbers  will  be  obtained  from  the  output 
queue  before  printing.  These  will  give  the  syntax  and  run-time  errors.  To  obtain  a 
count  of  logical  errors,  a count  of  the  total  number  of  times  a program  is  run  is  being 
kept.  It  is  assumed  that  the  number  of  logical  errors  is  one  less  than  the  number  of 
runs  which  yield  no  syntax  or  run-time  errors.  This  may  be  an  underestimate  but  since 
the  programs  being  used  are  of  first  year  programming  students,  it  is  reasonable  to 
assume  that  they  only  find  one  logical  error  at  a time. 

To  obtain  the  frequency  counts,  the  tape  containing  the  error-free  runs  will  be 
run  through  a program  supplied  by  Professor  Halstead  and  Mr.  Ottenstein  of  Purdue, 
which  analyzes  FORTRAN  programs  and  yields  the  frequencies  automatically.  All 
programs  on  both  tapes  will  be  sufficiently  identified  so  that  the  bugs  can  be  correlated 
with  the  frequency  counts. 

Obtaining the probabilities $p_i$ and $p_j$ is a secondary purpose of the project and will
supplement some statistics obtained previously by D. Knuth.

B. Tests for Obeyance of Zipf's Law

It is expected that the probabilities $p_i$ and $p_j$ will also obey Zipf's law, either in
its pure form (i.e., $p_r = c/r$, where $p_r$ is the probability of the operator or operand
whose rank is r), or in one of its modified forms (e.g., $p_r = c/(r + a)^b$). If the
frequencies also follow a Zipf's law, it may be possible to get a criterion for program length.
This, however, remains to be seen.
COMPLEXITIES OF NATURAL AND COMPUTER LANGUAGES
M.L.  Shooman  and  A.E.  Laemmel 

There is a great need for theoretical models which describe programs and allow
us to quantitatively estimate complexity, running time, storage requirements, and
development time. In addition to serving as an estimate during the initial design period,
they can be refined as the program develops and used as a management and analysis
tool. They can also be used to compare initial design approaches, programming styles,
different algorithms, etc. Early work on such a theory has been initiated (Ref. 1). This work
discusses the linguistic theory (Zipf's laws), extends these to programming languages,
and develops equations for program length based on these principles. This work is
similar to that of Halstead (Ref. 2), which is commonly known as "Software Physics."

There  are  many  similarities  between  natural  and  computer  languages,  and  we 
will  make  use  of  the  analogies  between  nouns  and  verbs  and  operands  and  operators. 

The  similarities  in  structure  and  content  of  natural  and  computer  language  are  further 
illustrated  via  the  following  thought  problem.  Suppose  we  take  a programmer  who  un- 
derstands a particular  computer  language  and  give  him  the  complete  listing,  comments, 
and  documentation  for  a computer  program.  We  instruct  him  to  study  the  computer 
program  until  he  understands  it  and  then  produce  a report  written  in  good  English  con- 
taining paragraphs,  complete  sentences,  and  algorithms  written  as  a sequence  of  steps 
in  good  English  without  mathematical  notation.  In  principle,  the  report  and  the  computer 
program  would  be  equivalent. 

A.  Zipf's  Law 

Before we discuss Zipf's law it is convenient to introduce a few terms in dealing
with natural language. We use the term token to refer to all the words of the written
or spoken sample. The term type is used to refer to the vocabulary of words in the
sample. Much of our effort will be centered on counting the number of times,
$n_r$, particular types occur in a sample of n tokens containing t types. The most
frequently occurring type will be assigned rank r = 1, the second most frequent type rank
r = 2, and the least frequent type rank r = t. Thus

$$ \sum_{r=1}^{t} n_r = n . \tag{1} $$

The absolute frequency of occurrence for type r is $n_r$; however, the relative frequency
of occurrence $f_r$ is simply $n_r/n$.

Zipf studied the relationship between relative frequency of occurrence $f_r$ and rank
r for words from English, Chinese, and the Latin of Plautus (Ref. 3).
Careful study of Zipf's data and that of others shows that $f_r$ vs. r plots as a
straight line on log-log paper, with a slope of -1 (unity in magnitude); thus we arrive
at the simple relationship called Zipf's law,

$$ f_r = \frac{c}{r} \tag{2a} $$

$$ n_r = \frac{cn}{r} \tag{2b} $$

Inspection of Eq. (2a) yields the fact that the constant c can be interpreted as the
relative frequency of the rank 1 word type (also the intercept with the r = 1 line).

B.  Type  Token  Equation 


If we sum both sides of Eq. (2b), we obtain, using Eq. (1),

$$ n = \sum_{r=1}^{t} n_r = cn \sum_{r=1}^{t} \frac{1}{r} . \tag{3} $$

The summation of the series 1/r is given by (Ref. 4)

$$ \sum_{r=1}^{t} \frac{1}{r} = \gamma + \ln t + \frac{1}{2t} - \cdots, \qquad \gamma = 0.5772 \tag{4} $$

Substitution from Eq. (4) into Eq. (3) (retaining only 2 terms for modest size t) yields

$$ n = cn(0.5772 + \ln t) \tag{5} $$

We can eliminate the constant c from Eq. (5) by considering the behavior of Eq.
(2b) for the largest rank, which is where r = t (e.g., if there are 100 types, then
the largest rank is obviously 100). In most cases the rarest type (largest rank) will
occur only once; thus, $n_t = 1$. Substituting these values yields c = t/n, which
when combined with Eq. (5) gives

$$ n = t(0.5772 + \ln t) \tag{6} $$
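As a quick numerical illustration of the type-token relation n = t(0.5772 + ln t), the sketch below evaluates it for a given number of types and inverts it by bisection to recover t from n. The function names and the bisection tolerance are assumptions introduced for this example.

```python
import math

EULER_GAMMA = 0.5772  # constant as quoted in the text


def tokens_from_types(t):
    """Estimated number of tokens n for a sample containing t types."""
    return t * (EULER_GAMMA + math.log(t))


def types_from_tokens(n, tol=1e-9):
    """Invert the relation by bisection (it is increasing for t >= 1)."""
    lo, hi = 1.0, max(2.0, float(n))
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if tokens_from_types(mid) < n:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


n_100 = tokens_from_types(100)      # about 518 tokens for 100 types
t_back = types_from_tokens(n_100)   # recovers roughly 100 types
```

The inversion is useful when a token count is available from an existing program and an implied vocabulary size is wanted for comparison.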

C.  Summary  of  Experimental  Results 

The results to date have shown that operators, operands, and the sum of
operators plus operands all rectify fairly well with a slope of unity on log-log paper (i.e.,
they fit Zipf's law) for:

(1) An 11 line and a 27 line PL/I program (55 and 222 tokens)

(2) The MIKBUG machine language executive program for the M6800 microprocessor
(322 tokens)

(3) Operators in PDP-11 assembly language programs (1572 tokens)

(4) Variable names in 3 PL/I programs (320, 238, and 193 tokens).
D. Relationship to "Software Physics"

The initial motivation for the application of Zipf's law to computer languages came
from a review of Halstead's work on Software Physics (Ref. 2). Early in his work he arrives
at a formula for program length

$$ L = N_1 \log_2 N_1 + N_2 \log_2 N_2 \tag{7} $$

where

L ≡ Program length
$N_1$ ≡ Number of distinct operator types
$N_2$ ≡ Number of distinct operand types

In terms of our notation the analogous quantities are

$$ t = N_1 + N_2, \qquad n = L \tag{8} $$

Note that Eqs. (7) and (6) are of similar form. In Ref. 1 we compare the actual number
of tokens (counted) with the number of tokens calculated using both Equations (6) and (7).
The average error and average magnitude error are computed, and both equations yield
good agreement (10-20%) between actual and calculated results.
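The two length estimates can be placed side by side in a few lines. The function names and the operator and operand counts (16 and 32) are hypothetical, chosen so that the base-2 logarithms are exact; they are not data from the report.

```python
import math


def halstead_length(n_operators, n_operands):
    """Length estimate N1*log2(N1) + N2*log2(N2) of Eq. (7)."""
    return (n_operators * math.log2(n_operators)
            + n_operands * math.log2(n_operands))


def zipf_length(n_operators, n_operands):
    """Length estimate t*(0.5772 + ln t) with t = N1 + N2."""
    t = n_operators + n_operands
    return t * (0.5772 + math.log(t))


# Hypothetical program with 16 operator types and 32 operand types.
length_halstead = halstead_length(16, 32)
length_zipf = zipf_length(16, 32)
```

For this hypothetical case the two estimates differ by about 5%, which is within the 10-20% agreement with measured lengths quoted above.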

E.  Estimation  of  Program  Length  Early  in  Design 

One method of initially estimating program length (token length*) is to estimate
the number of tokens. We assume the analyst initially has a complete description of
the problem and that a partial analysis and choice of key algorithms has been made.

An elementary approach might be to estimate the token size by

(1) Estimating the number of operator types which will be used in the language
by the assigned programmers.

(2) Estimating the number of input variables, output variables, intermediate
variables, and constants needed.

(3) Summing the estimates of steps (1) and (2) and substituting in Equation (6).

Clearly we might consult past programs written by the assigned programmers for
data on number of operator types. Also, a large program will be stated in a comprehensive
specification document, written in English or in a specification language. This
should name all input and output variables; thus our estimates will mainly deal with
predicting the number of intermediate variables and constants.

* In addition, one must add other classes of statements and programming elements
such as comments, declares, certain assembler directives, etc.
EXPERIMENTAL VERIFICATION OF DEBUGGING MODELS
D.L.  Baggi 

The object of this study is the implementation of a so-called driver program which,
given a program to test, will force the traversal of all its possible paths. The
advantage of such a procedure is obvious for programs with several branches and
decision points; they are usually debugged by laborious construction of a data set which,
it is hoped, will cause exploration of all paths. The method described here forces
exhaustive testing of all possible paths, with no need for the design of a data set.

A.  Initial  Drivers 

The first effort in the definition of the driver program consisted of the
implementation of a program capable of traversing all paths of a given program. This
program iteratively substitutes, in place of the condition in a PL/I IF-statement, a
value of zero or one, alternately. With this done concurrently for all conditional
statements, one eventually traverses all possible paths. It was realized that the
resulting number of paths, $2^n$ for $n$ conditional branches, is merely an upper bound on
the number of possible paths; the real number lies between $n + 1$ and $2^n$. Hence the
above scheme wastes execution time, which increases exponentially (for example, a
program with fourteen paths may require 8192 runs, of which 8178 are meaningless!).
Thus an algorithm had to be designed to consider the possible paths only.
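
The gap between the exponential upper bound and the true path count can be illustrated with a short sketch. The Python below is not the PL/I driver itself; it assumes a hypothetical chain of n IF ... ELSE IF ... statements, in which conditions after the first one forced to 1 are never evaluated, and it counts how many of the forced 0/1 assignments lead to distinct paths. The function names are illustrative only.

```python
from itertools import product


def chain_path(bits):
    """Path actually taken through a hypothetical IF / ELSE IF chain.

    bits[k] is the value forced onto the k-th condition. Evaluation stops
    at the first condition forced to 1, so later bits are never consulted.
    """
    taken = []
    for b in bits:
        taken.append(b)
        if b == 1:
            break
    return tuple(taken)


def runs_and_paths(n):
    """Return (runs made by the naive driver, distinct paths traversed)."""
    runs = list(product((1, 0), repeat=n))
    distinct = {chain_path(bits) for bits in runs}
    return len(runs), len(distinct)


if __name__ == "__main__":
    for n in (3, 13):
        print(n, runs_and_paths(n))
```

For such a chain the naive scheme makes 2^n runs while only n + 1 of them differ; with n = 13 that is 8192 runs for 14 distinct paths.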

B.  Present  Drivers 

The present algorithm scans a PL/I program. It searches for keywords such
as IF, THEN, ELSE, DO and END, while constructing regular expressions of zeros and
ones. Resolution of this regular expression yields a set of binary integers, which
represent the status of the conditional expressions during each of the runs through all
possible paths. In fact, each bit of such an integer represents the value of the next
conditional expression met during execution, hence forcing a zero- or one-branch
accordingly. Since those integers are derived from the very structure of the algorithm of
the program, they do represent each path; as an extra bonus, their total number is
the number of possible paths. Hence the algorithm enumerates each path, uniquely
describing it in terms of its branches, and also counts them.

The  algorithm  works  as  follows: 

each  expression  is  binary;  it  contains  two  terms  separated  by  a + sign 

the  terms  can  be  only  1,0,  or  1 or  0 concatenated  with  a binary  expression. 


Scanning  rules: 
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(a)  each  IF  opens  a left  parenthesis,  ( 

(b)  each  THEN  corresponds  to  a 1 

(c)  each  ELSE  corresponds  to  a 0 

(d)  each  well  completed  binary  expression,  with  both  terms  completed, 
compels  closing  with  right  parentheses  at  its  level. 

Note: The algorithm takes care of missing ELSE clauses.
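
The "resolution" of such a regular expression into its set of paths can be sketched as follows. This is a minimal Python illustration, not the LISP implementation described in this report: it assumes the expression has already been built by the scanning rules, treats `+` as alternation and juxtaposition as concatenation, and returns each path as a string of 0s and 1s. The function name `expand` is hypothetical.

```python
def expand(expr):
    """Expand a path expression such as "(1+0(1+0))" into its list of paths.

    Assumed grammar (an interpretation of the notation used in the text):
        alt    := seq ('+' seq)*
        seq    := factor*
        factor := '0' | '1' | '(' alt ')'
    """
    s = expr.replace(" ", "")
    pos = 0

    def alt():
        nonlocal pos
        paths = seq()
        while pos < len(s) and s[pos] == "+":
            pos += 1
            paths = paths + seq()      # alternation: union of both sides
        return paths

    def seq():
        nonlocal pos
        paths = [""]
        while pos < len(s) and s[pos] in "01(":
            if s[pos] == "(":
                pos += 1
                inner = alt()
                if pos >= len(s) or s[pos] != ")":
                    raise ValueError("unbalanced parentheses")
                pos += 1
            else:
                inner = [s[pos]]
                pos += 1
            # concatenation: every prefix followed by every continuation
            paths = [p + q for p in paths for q in inner]
        return paths

    result = alt()
    if pos != len(s):
        raise ValueError(f"unexpected character at position {pos}")
    return result


if __name__ == "__main__":
    for e in ("(1+0)(1+0)(1+0)", "(1+0(1+0(1+0(1+0))))", "(1(1(1+0)+0)+0)"):
        print(e, "->", expand(e))
```

The length of the returned list is the number of possible paths, which is the counting property noted above.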

C.  Examples 

IF  cond  THEN  stmt; 

ELSE  stmt; 

IF  cond  THEN  stmt; 

ELSE  stmt; 

IF  cond  THEN  stmt; 

ELSE  stmt; 

Result:

(1+0)(1+0)(1+0)
or the eight paths
111, 011, 101, 001, 110, 010, 100, 000

IF  cond  THEN  IF  cond  THEN  IF  cond  THEN  stmt; 

ELSE  stmt; 

ELSE  IF  cond  THEN  stmt; 

ELSE  IF  cond  THEN  IF  cond  THEN  stmt; 

ELSE  IF  cond  THEN  stmt; 

ELSE 

Result:

(1(1(1+0)+0(1+0))+0(1(1+0)+0(1+0)))
which gives the eight paths
111, 110, 101, 100, 011, 010, 001, 000

IF  cond  THEN  stmt; 

ELSE  IF  cond  THEN  stmt; 

ELSE IF cond THEN stmt;
ELSE stmt;

Yields:

(1+0(1+0(1+0(1+0))))
or

1, 01, 001, 0001, 0000
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IF  cond  THEN  DO;  IF  cond  THEN  DO;  IF  cond  THEN  stmt; 

ELSE  stmt; 

END; 

ELSE  stmt; 

END; 

ELSE  stmt; 


Gives:

(1(1(1+0)+0)+0)

i.e.,

111, 110, 10, 0


Because of the recursive nature of the algorithm, it has been implemented in the
language LISP. It could eventually be translated into PL/I. In the meantime, however,
the greatest concern still lies in making sure that such an approach, and such an
algorithm, works at all; hence the choice of LISP for fast implementation. No attention
has been given so far to repetitive DO-groups, which were appropriately handled by the
initial crude algorithm. This is because extension of this algorithm to such loops is
conceptually trivial, and it can be proved that the system would perform equally
well; but, in the meantime, such an extension looks very time-consuming and would
add no contribution to the theory of this debugging model.


D.  Direction  for  Further  Work 


The long-range idea is to eventually come up with a complete PL/I package
capable of exploring all paths of a program, paying attention to IF-statements, DO-loops,
etc. In the meantime, however, attention is given to producing a tentative system
capable of showing that such a project is indeed possible. To this end, the following set
of programs is under construction:

(1) A PL/I program which reads in the object program (the one under debugging)
and translates it into LISP-compatible notation; this is saved on a file.

(2) A LISP program which scans the object program and constructs the regular
expressions, with results saved on a second file.

(3) A PL/I driver program which reads the results of these expressions
and forces execution of the object program through all its possible paths.
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AUTOMATIC  PROGRAMMING  TECHNIQUES 
E.  Lipschitz 

Different methods and techniques for writing a better software package are
currently being sought. The goal is to write programs which require a shorter testing
and debugging time to achieve a certain degree of reliability.


One approach is automatic programming. The use of pre-written and already
tested code modules reduces the number of bugs.


A. The Program "AUTO-PROGRAMMING"

The  working  hypothesis  is  that  there  exists  a high  degree  of  commonality  among 
commercial  applications  which  can  be  exploited  to  automate  the  production  of  code, 
once processing and output specifications are defined.

"AUTO-PROGRAMMING" is divided into two parts -- "Flow" and "Auto," both of
which are interactive on-line programs that, by communicating with the user, generate
his programs.

1. The Program "Flow"


"Flow" receives the information about the flow-chart of a program from the user
and generates it. "Flow" recognizes only four different types of blocks, which are
sufficient to generate any flow-chart. They are:


Type No. 1: Control block: A conditional decision block, similar to the
statement If ( ). Go to ( ).

Type No. 2: Functional block: This block will perform a complete task
selected from those in the computer library.

Type No. 3: Stop block: This block indicates the end of a path; i.e.,
Stop statement.

Type No. 4: User's code block: The user inserts the code he wants into
this block. This feature is used whenever the library does
not include programs for the needed task.


Upon  completing  the  flow-chart,  control  passes  to  "Auto."  "Auto"  will  generate  the 
code  for  blocks  Types  No.  2 and  No.  3,  while  the  user  will  generate  the  code  for  blocks 
Types  No.  1 and  No.  4. 


2.  The  Program  "Auto" 

A collection of code modules can be stored as a system library. The desired
program can be achieved by concatenating different code modules from the system
library with code generated by the user.

The  user  will  specify  to  "Auto"  what  he  would  like  to  do,  and  "Auto"  will  advise 
him  which  methods  are  available  for  the  solution,  as  well  as  their  characteristics.  The 
user  will  then  choose  the  method  he  prefers,  and  "Auto"  will  generate  the  needed  code. 




The library currently contains the following modules.

1. Linear Search

2. Binary Search

3. Interchange Sort

4. Shell Sort

5. Sin(x), $0 \le x \le 2\pi$

   $\sin x = \sum_{n=0}^{m} \frac{(-1)^n x^{2n+1}}{(2n+1)!}$

6. Cos(x), $-2\pi \le x \le 2\pi$

   $\cos x = \sum_{n=0}^{m} \frac{(-1)^n x^{2n}}{(2n)!}$

7. Ln(x)

   $\ln x = 2 \sum_{n=0}^{m} \frac{1}{2n+1} \left( \frac{x-1}{x+1} \right)^{2n+1}$

8. Exp(x)

   $e^x = \sum_{n=0}^{m} \frac{x^n}{n!}$

9. Arctan(x) for $|x| < 1$

   $\tan^{-1} x = \sum_{n=0}^{m} \frac{(-1)^n x^{2n+1}}{2n+1}$

10. Bessel Function of the First Kind and Zero Order

    $J_0(x) = \sum_{n=0}^{m} \frac{(-x^2/4)^n}{(n!)^2}$

11. Bessel Function of the First Kind, Orders 0, 1 and 2

    $J_0(x) = \sum_{n=0}^{m} \frac{(-x^2/4)^n}{(n!)^2}$

    $J_1(x) = \frac{x}{2} \sum_{n=0}^{m} \frac{(-x^2/4)^n}{(n!)^2 (n+1)}$

    $J_2(x) = \frac{2}{x} J_1(x) - J_0(x)$

12. Modified Bessel Function of the First Kind, Orders 0, 1 and 2

    $I_0(x) = \sum_{n=0}^{m} \frac{(x^2/4)^n}{(n!)^2}$

    $I_1(x) = \frac{x}{2} \sum_{n=0}^{m} \frac{(x^2/4)^n}{(n!)^2 (n+1)}$

    $I_2(x) = I_0(x) - \frac{2}{x} I_1(x)$

13. Error Function

    $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\, dt = \frac{2}{\sqrt{\pi}} \sum_{n=0}^{m} \frac{(-1)^n x^{2n+1}}{n!\,(2n+1)}$

14. Fresnel Integral C(x)

    $C(x) = \int_0^x \cos\left( \frac{\pi}{2} t^2 \right) dt = \sum_{n=0}^{m} \frac{(-1)^n (\pi/2)^{2n} x^{4n+1}}{(2n)!\,(4n+1)}$

15. Fresnel Integral $C_2(x)$

    $C_2(x) = \frac{1}{\sqrt{2\pi}} \int_0^x \frac{\cos t}{\sqrt{t}}\, dt = \frac{1}{\sqrt{2\pi}} \sum_{n=0}^{m} \frac{(-1)^n x^{2n+1/2}}{(2n)!\,(2n+1/2)}$

16. Fresnel Integral S(x)

    $S(x) = \int_0^x \sin\left( \frac{\pi}{2} t^2 \right) dt = \sum_{n=0}^{m} \frac{(-1)^n (\pi/2)^{2n+1} x^{4n+3}}{(2n+1)!\,(4n+3)}$

17. Sine Integral

    $\mathrm{Si}(x) = \int_0^x \frac{\sin t}{t}\, dt = \sum_{n=0}^{m} \frac{(-1)^n x^{2n+1}}{(2n+1)!\,(2n+1)}$

18. Cosine Integral

    $\mathrm{Cin}(x) = \int_0^x \frac{1 - \cos t}{t}\, dt = -\sum_{n=1}^{m} \frac{(-1)^n x^{2n}}{(2n)!\,(2n)}$

19. Dilogarithm

    $f(x) = -\int_1^x \frac{\ln t}{t-1}\, dt = \sum_{n=1}^{m} \frac{(-1)^n (x-1)^n}{n^2}$


Note: m is so chosen that the magnitude of the m-th term of the power series is less
than or equal to a prescribed tolerance (a negative power of ten), while the magnitude
of the (m-1)-th term is greater than that tolerance.
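
The stopping rule in the note can be sketched for one library entry, the sine series. This Python fragment is only an illustration of the rule, not the library module itself: the tolerance value `1e-8` and the function name `series_sin` are assumptions, and each term is obtained from the previous one by a ratio rather than by recomputing powers and factorials.

```python
import math


def series_sin(x, tol=1e-8):
    """Sum the power series for sin(x) until a term has magnitude <= tol.

    Returns (sum, m), where m is the index of the last term included,
    i.e. the first term whose magnitude does not exceed tol.
    """
    term = x            # n = 0 term: x^1 / 1!
    total = 0.0
    n = 0
    while True:
        total += term
        if abs(term) <= tol:
            return total, n
        n += 1
        # term_n / term_(n-1) = -x^2 / ((2n)(2n+1))
        term *= -x * x / ((2 * n) * (2 * n + 1))


if __name__ == "__main__":
    for x in (0.5, 1.0, 3.0, 6.0):
        approx, m = series_sin(x)
        print(x, m, approx, math.sin(x))
```

Larger arguments need more terms before the rule is satisfied, which is one reason a module of this kind would restrict its argument range.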


The development of "AUTO-PROGRAMMING" will continue to concentrate mainly
on increasing the size of the library. More mathematical programs, as well as some
utility programs for data manipulation, will be developed in the near future.
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SMALL  SCALE  TESTS 
H.  Ruston  and  M.L.  Shooman 

In order to verify the theoretical models, four programs have been written by
student programmers, and careful records were kept of their debugging experiences. We
will describe the assigned programs and the error reporting form. These data are
presently being reduced and the resulting conclusions will be described in a future report.

A.  The  Programmers 

The programmers were undergraduate students of sophomore or junior standing,
with high interest and ability in programming topics. Because of this selectivity we
believe their product is likely to be equivalent to that of programmers with
intermediate experience.

Consequently, we consider the obtained test data to be representative of normal
practice.

B.  The  Instructions 

The programmers were made aware of the importance of maintaining careful and
truthful records. They were also given the following specific instructions:

(1) The problems were to be analyzed and coded, with both analysis and coding
times recorded.

(2) The resulting program was to be corrected for syntax errors only. Their
number, the number of runs needed for their correction, and run times were to
be recorded. All print-outs were to be saved and numbered.

(3) The programs were then presented to us (i.e., M. Shooman and H. Ruston).
We planned to ask the program author and other members of the group to
debug each copy independently, recording:

a.  Number  of  bugs  and  types  found  in  each  debugging  shot 

b.  Analysis  time  and  computer  time  for  each  debugging  shot 

c.  History  of  removed  bugs  and  generated  bugs. 

(4)  If  a programmer  reached  a blind  alley  he  had  to  consult  with  us.  He  could 
not  ask  for  other  help,  or  abort  the  program. 

(5)  The  program  had  to  be  constructed  with  the  following  constraints: 

a.  To  be  structured 

b.  To  contain  no  impurities  (as  listed  in  Halstead^) 

c.  The  main  program  to  be  the  control  structure  with  calls  to 
modules  (i.e.,  blocks  or  procedures) 

d.  No  module  to  exceed  50  lines. 
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C.  The  Four  Programs 

Three small problems (Problems 1, 2 and 3) and one medium-size problem were
generated for the small scale tests. The initial write-ups of the problems for the
desired four programs follow.


Problem  No.  1 - Minimum  Salary  Payroll  Adjustment 

1.  Statement  of  the  Payroll  Adjustment 


Glen Cove University, which employs 200 faculty members, has just signed a
non-faculty union contract. All salaries of $16,000 or higher are to remain
unchanged. Any faculty member who earns less than $16,000 per year is to
receive a pay raise according to the following formula: He will receive $100
per year additional for each dependent (including himself), plus $50 per year
for each year of employment. In no case may his new salary exceed $16,000
per year.


The  personal  data  on  all  faculty  members  is  stored  on  magnetic  tape  in  the 
Business  Office  and  includes  present  annual  salary,  number  of  dependents, 
date  of  hire  and  other  information.  The  problem  is  to  write  a program  which 
computes  and  prints  out  the  list  of  faculty  members  along  with  their  old  and 
new  salary. 

Assume  that  the  Business  Office  will  give  you  a set  of  cards  with  the  data  for 
each  person  on  one  card.  You  should  create  your  own  test  data;  however, 
final  testing  of  your  program  will  be  done  on  the  actual  card  deck.  Assume 
the  following  arrays  will  accept  the  input  data  in  your  program: 


NAME (200):       contains a name of up to 30 characters

DEPEND (200):     contains number of dependents in two
                  decimal digits

DHIRE (200):      contains a string of 10 characters, two
                  digits for month, a blank, two digits
                  for day, a blank, and four digits for year

PRESS SAL (200):  contains 5 digits with yearly salary in
                  rounded dollars.


2. Approaches (write two programs, one using each approach):

(a) Search list of names for those making less than $16,000 per year,
compute new salary, check $16,000 limit, store new salary, print
output

(b) Sort list by salary from lowest to highest, stop when $16,000 is
exceeded, compute new salary, check $16,000 limit, store new
salary, print output.

3. Language and Computer: PLAGO on Poly 360/65.
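
The raise formula in the problem statement can be made concrete with a small sketch. It is written in Python rather than the PL/I dialect assigned to the students, and it rests on two assumptions not settled by the write-up: that the dependent count passed in already includes the employee, and that years of employment has been computed elsewhere from the date of hire as a whole number. The function and constant names are hypothetical.

```python
SALARY_CAP = 16000          # salaries at or above this amount are unchanged
PER_DEPENDENT = 100         # dollars per year for each dependent
PER_YEAR_EMPLOYED = 50      # dollars per year for each year of employment


def adjusted_salary(salary, dependents, years_employed):
    """Return the new annual salary under the adjustment rule.

    Assumes `dependents` already counts the employee and `years_employed`
    is a whole number derived from the date of hire.
    """
    if salary >= SALARY_CAP:
        return salary
    raise_amount = PER_DEPENDENT * dependents + PER_YEAR_EMPLOYED * years_employed
    # The raise may never carry the salary past the cap.
    return min(salary + raise_amount, SALARY_CAP)


if __name__ == "__main__":
    print(adjusted_salary(12000, 3, 10))
    print(adjusted_salary(15900, 4, 20))
```

Both approaches (a) and (b) would apply the same per-person computation; they differ only in how the records below the cap are located.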

Problem  No.  2 - Finding  the  Roots  of  a Cubic  Equation 

1. Statement of Problem

The polynomial equation $a_3 x^3 + a_2 x^2 + a_1 x + a_0 = 0$ is to be solved for its three
roots. A general solution is desired which will work for any finite real values
of $a_3$, $a_2$, $a_1$ and $a_0$. The values of $a_3$, $a_2$, $a_1$, and $a_0$ are to be acquired as
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floating  (single  precision)  input  data  at  the  beginning  of  each  run.  Write  your 
program  with  a loop  so  it  reads  any  number  of  data  cards  and  terminates  on 
the  last  data  card.  Make  up  your  own  test  data;  however,  it  will  be  tested 
finally  with  supplied  data  cards. 

2.  Approaches 

(a)  The  general  formula  (i.e.,  Cardano's  formula)  for  the  solution  of  a 
cubic  is  to  be  used  to  compute  the  roots 

(b)  An  iterative  solution  for  a single  real  root  is  to  be  obtained.  Once 
the  real  root  is  removed,  the  quadratic  formula  is  to  be  used  to 
solve  for  the  other  two  roots. 

3.  Language  and  Computer:  PLAGO  on  Poly  360/65. 
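
Approach (b) can be sketched as follows. The problem statement does not say which iteration to use; this Python illustration assumes bisection on an interval given by the Cauchy root bound, then removes the real root by synthetic division and applies the quadratic formula. The function name `solve_cubic` and the tolerance are hypothetical, and the sketch requires a nonzero leading coefficient.

```python
import cmath


def solve_cubic(a3, a2, a1, a0, tol=1e-12):
    """Return (r, x1, x2): one real root r and the two remaining roots."""
    if a3 == 0:
        raise ValueError("a3 must be nonzero for a cubic")

    def p(x):
        return ((a3 * x + a2) * x + a1) * x + a0

    # Cauchy bound: every root lies strictly inside [-bound, bound], so the
    # cubic has opposite signs at the two ends of the interval.
    bound = 1.0 + max(abs(a2), abs(a1), abs(a0)) / abs(a3)
    lo, hi = -bound, bound
    r = 0.0
    for _ in range(200):
        r = 0.5 * (lo + hi)
        fr = p(r)
        if fr == 0 or hi - lo < tol:
            break
        if (fr > 0) == (p(lo) > 0):
            lo = r
        else:
            hi = r

    # Synthetic division by (x - r) leaves b2*x^2 + b1*x + b0.
    b2 = a3
    b1 = a2 + r * b2
    b0 = a1 + r * b1
    disc = cmath.sqrt(b1 * b1 - 4 * b2 * b0)
    return r, (-b1 + disc) / (2 * b2), (-b1 - disc) / (2 * b2)


if __name__ == "__main__":
    print(solve_cubic(1, -6, 11, -6))   # roots of (x-1)(x-2)(x-3)
    print(solve_cubic(1, 0, 0, -1))     # one real and two complex roots
```

Using `cmath.sqrt` lets the same code return a complex-conjugate pair when the deflated quadratic has a negative discriminant.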

Problem  No.  3 - Manipulation  of  a File  of  Research  Reports 

1. Statement of the Problem

At present the Reliability, Safety, and Software Engineering Group at the
Polytechnic has about 1,000 reports, papers, books, and journal proceedings in its
library. Each item is entered on an index card in a file box. It is anticipated
that the library may eventually grow to 10,000 items. In the near
future, two punched cards will be created for each of the items in the library
and we wish to create a program to perform various searches, sorts, and
listings.

Assume that each punched card contains 4 character fields. The first field is
30 characters wide and contains the author(s) name(s). The second field is 50
characters wide and contains the full or abbreviated title. (No important words
in the title are to be abbreviated.) The second card contains field 3, which is 50
characters wide and contains key words (or abbreviated key words) in the item.
The last field is 30 characters wide and contains the source (journal, issue,
book publisher, proceedings, etc.) of the item.

The program must be able to perform the following tasks, and the selection (and
possibly sequence) of tasks to be performed must be controlled by the first data
card which will serve as a program control card. Provide a means of
performing tasks on the same run.

(1)  Read  in  a variable  length  stack  of  item  cards  and  store  them 

(2)  Alphabetize  the  data  by  first  author 

(3)  Print  out  the  list  of  items 

(4)  Create  a list  of  key  words  (from  field  (3)),  eliminate  duplicates, 
alphabetize,  and  print  out 

(5) Search the author field for a given author's name, and print out
the list of items he has written

(6) Search the key word field for items which contain the "intersection"
(AND) of one, two, or three input key words

(7) Provide the same search facility as (6) on words in the title field.

Programmer should create his own test cards, and final testing will be
performed by a supplied deck of item cards.
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2.  Approaches 

Programmer  should  provide  his  own  approaches. 

3. Language and Computer: PLAGO on Poly 360/65.

Problem  No.  4 - Specifications  for  Ballot  Counting  Procedure 

1. Terms

An election consists of sets of ballots to elect persons for various committees.
Each set of ballots is called a committee election. There may be up to 10
committee elections for an election.

All ballots for a particular committee have the name of the committee punched
in columns 61-80 and contain the names of the candidates to that committee.
There are up to 25 nominees for each committee.

When a person votes for a candidate, an "11" punch (i.e., a minus sign) is punched
in the ballot in the field consisting of columns 21-45 corresponding to candidates
1-25. Such a punch is a mark. A particular ballot may have more than one
mark since a particular committee may have more than one vacant position (i.e.,
there may be more than one vote allowed to each voter on each ballot).

An election package consists of one ballot for each committee. Each eligible
voter receives one and only one election package which he marks and returns for
counting.

2.  Specifications 

Prior to counting the ballots, the program must read certain preliminary
information concerning the election.

For  each  committee  election  the  program  must  be  informed  as  to: 

(1)  The  name  of  the  committee 

(2)  The  number  of  candidates 

(3)  The  number  of  marks  permitted  (e.g.,  if  "vote  for  three"  then  three  marks 
are  allowed). 

A rough  flow  chart  is  shown  below. 

DETERMINE  VALIDITY 

(1) Check election name (cols. 61-80). If name is found then

(2) Check all marks (cols. 21-45) to see if all are '-' or ' '. If ok then

(3) Check that no mark occurs after last candidate.

If ok then

(4) Count number of marks. If total is less than or equal to the number of
marks permitted, then ballot is valid.


If  any  test  fails  then  ballot  is  invalid. 
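
The four validity tests can be sketched as a single function. This Python illustration is an assumption-laden reading of the specification, not a solution produced in the study: a ballot is modeled as an 80-column string, a mark is the character `-`, and the preliminary election information is held in a hypothetical dictionary mapping each committee name to its number of candidates and number of marks permitted.

```python
def ballot_is_valid(card, committees):
    """Apply the four validity checks to one ballot card.

    card       -- string of up to 80 columns (column 1 is card[0])
    committees -- dict: committee name -> (n_candidates, marks_permitted)
    """
    card = card.ljust(80)

    # (1) Election name in columns 61-80 must be a known committee.
    name = card[60:80].strip()
    if name not in committees:
        return False
    n_candidates, marks_permitted = committees[name]

    # (2) Columns 21-45 may contain only marks ('-') or blanks.
    marks = card[20:45]
    if any(ch not in "- " for ch in marks):
        return False

    # (3) No mark may occur after the last candidate.
    if "-" in marks[n_candidates:]:
        return False

    # (4) The number of marks must not exceed the number permitted.
    return marks.count("-") <= marks_permitted


if __name__ == "__main__":
    committees = {"PPG": (8, 3)}
    card = " " * 20 + "- - -".ljust(25) + " " * 15 + "PPG".ljust(20)
    print(ballot_is_valid(card, committees))
```

The checks are ordered as in the flow chart, so the first failing test decides that the ballot is invalid.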
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PROCESS  BALLOT 

Consists of tallying the marks in some sort of array, probably 25 x 10.

OUTPUT 

Printing  tallies. 

Typical  output  for  3 committees  named  PPG,  SAB,  TENURE 


CANDIDATE  NUMBER 

PPG 

SAB 

TENURE 

1 

5 

3 

9 

2 

2 

2 

2 

3 

6 

8 

3 

4 

7 

7 

5 

5 

9 

1 

8 

6 

1 

0 

7 

7 

4 

2 

8 

0 

5 

9 

0 

10 

8 

THERE WERE 50 VALID BALLOTS

THERE WERE 4 INVALID BALLOTS AND THESE ARE REPRINTED ON
PREVIOUS PAGE

If there are no invalid ballots, then the last line can be left unprinted. The
sample output has 10 rows of tallies since SAB has 10 candidates. Provision
should be made for printing any number of rows from a minimum of 2 to a
maximum number of candidates for any committee.
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D.  The  Reporting  Form 

After  several  revisions  the  resulting  reporting  form  has  been  selected. 

POLYTECHNIC  INSTITUTE  OF  NEW  YORK 

Department  of  EE/EP 
Division  of  Computer  Science 
Safety,  Reliability  and  Software  Engineering  Group 

RADC Contract F30602-74-C-0294 (Softy)

INDIVIDUAL ERROR REPORTING FORM
(This must be completed for each non-syntax error.)

1. Identification

(a) Programmer's Name

(b) Program Title

(c) Date

(d) Form Number (for this program)

(e) Description of the error (be precise)


(f) Description of the correction (be precise)


2. Means of Detection - Corrections (not New Reqs.)

- More than one category may be checked

a. Hand Processing

b. Personal Communication

c. Infinite Loop

d. Interrupt Error (Code ___)

e. Incorrect Output or Result

f. Missing Output

g. Other - Explain

3.  Effort  to  Diagnose  the  Error  - Do  not  include  effort  spent  in  initial  detection 

a.  No.  of  Runs  to  Diagnose Elapsed  Computer  Time  (Minutes) 

b.  Working  Time  to  Diagnose  Hours  

4.  Category  of  Change 

SOFTWARE  CHANGE  REQUIRED 

Nature  of  Change 

Documentation  (Preface  or  Comments) 

Fix  Instruction 

Change  Constants 

Structural 

Algorithmic 

Other  - Explain 
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Source  of  Bug 


Bug  essentially  unrelated  to  previous  corrections  (i.e.,  usual  case  of 

bug  just  discovered) 

Previous  correction  did  not  remove  the  believed  error  (i.e.,  improper 

or  incomplete  analysis) 

New  bug,  introduced  by  a previous  correction  (i.e.,  bug  generation 

through  a correction) 


Type  of  Bug 


Misinterpretation  of  Specifications 

Wrong  Specifications 

Incomplete  Specifications 

Incorrect  Sequencing  of  Computations 

Incorrect  Input  Data  (Type  and  Quantity) 

Incorrect  Expressions 

Incorrect  Declaration 

No  Defense  Against  Invalid  Data 


Operating  System 
Support  Software 
Card  Mispunched 
Other  - Explain 


5. Difficulty of Correction

a. No. of Runs to Correct ___ Elapsed Computer Time (Minutes) ___

b. Working time to Debug: Days ___ Hours ___

c. No. of Cards: Changed ___ Added ___ Deleted ___


6.  Comments  (Use  Reverse  Side  and  Additional  Sheets  if  Necessary) 
E.  The  Present  Status  and  Planned  Activities 


Problems  1,  2 and  3 have  been  debugged  by  single  programmers,  that  is,  by  just 
their  authors.  A set  of  test  data  has  been  selected  for  problem  1,  and  the  several 
versions  of  the  first  program  have  been  exercised  with  this  test  data  successfully. 


It  is  planned  to  perform  the  additional  debugging  with  other  programmers  and  to 
use  the  test  data  for  the  experimental  small  scale  verification  of  our  theoretical  work. 


Rome  Air  Development  Center 
F30602-74-C-0294 


H.  Ruston  and  M.L.  Shooman 
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MICRO  RELIABILITY  MODELS 
M.L.  Shooman 

Many previous software reliability prediction models by this author and others$^{1,2}$
have concentrated on the bulk (macro) aspects of a program. This work involves a
newly developed micro model$^3$ which is based on program structure.

It  is  assumed  that  the  program  has  been  written  in  structured  or  modular  form 
so  that  decomposition  into  its  constituent  parts  is  simple.  Further,  we  assume  that 
via  analysis  of  the  program  the  decomposition  can  be  related  to  several  paths  or  other 
functional  structures  within  the  program. 

The model is constructed based upon the frequency with which each path $j$
is run ($f_j$), the running time of each path ($t_j$), and the probability of error along
each path ($q_j$).

Several methods of calculating or measuring the $f_j$, $t_j$ and $q_j$ parameters are
suggested. In fact it is possible to use one technique (historical data) to produce crude
estimates at the start of the design, and refine the estimates with more accurate values
as the design progresses. Given the existence of such a model, we can consider the
application of three important design techniques which are impossible with a
macroscopic model:

(1) Apportionment of the software reliability (or mean time to failure)
specification among the subsystems so each design team has its own goal to meet.
The apportionment is obviously done so that the subsystem reliabilities
combine to yield a system reliability which meets system specifications.

(2) If design proceeds either bottom up or top down, eventually there is a
system integration phase where all parts are put together and tried out. The
macroscopic models developed previously could not be applied before the
system reached the integration stage. However, the new microscopic model
proposed can be used to combine the results of the module development
phases to predict a preliminary software reliability index before the system
integration phase.

(3) The microscopic model is based upon measurements made on the software
design. Such measurements and analyses performed on the software lead to
a more disciplined design and provide insight into how the module
performance relates to overall software system performance.


A. Micro Decomposition Model


The  micro  decomposition  model  which  will  be  proposed  in  this  section  is  based 
upon  several  assumptions.  We  first  assume  that  the  program  has  been  designed  using 
a structured  or  modular  philosophy  and  as  a result  there  emerges  a natural  structure 
of  the  program  which  can  be  described  as  consisting  of  a number  of  paths,  cases,  parts, 
modules,  or  subprograms.  The  decomposition  focuses  about  this  natural  structure. 

In  general  we  will  primarily  use  the  term  paths  from  now  on  to  designate  the  paths. 
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cases,  parts,  modules,  subprograms,  or  any  other  important  substructure.  We  also 
assume that the majority of the paths are independent of each other. (One could
probably tolerate some type of dependence in the model if it were limited.)

The decomposition model will be developed from the probabilistic viewpoint of
relative frequency. We will hypothesize a sequence of tests which either uncover a
bug (failure) or run to completion without uncovering a bug (success). We begin our
development of the model by defining the following variables and parameters:

$N \equiv$ The number of tests

$i \equiv$ The number of software paths (cases, parts, modules, etc.)

$t_i \equiv$ Time to run case $i$ (if time is not deterministic we can substitute the
mean value of $t_i$, i.e., $\bar{t}_i$)

$q_i \equiv$ Probability of error on each run of case $i$ (the probability of no error
is $p_i = 1 - q_i$)

$f_i \equiv$ Frequency with which case $i$ is run

$n_f \equiv$ Total number of failures in $N$ tests

$H \equiv$ Total cumulative test time in hours.


Note that in the above set of definitions we have defined $N$ as the number of tests. Thus,
we are modeling actual or simulated operation by a succession of $N$ tests (path traversals)
of the system. We also assume the input data varies on each traversal. This is the
reason why we have assigned a constant, $q_i$, as the probability of encountering a bug on each
run.

If there were no variation in input parameters, and three successive tests each
traversed path $j$, then the probability of encountering an error on the first trial would
be $q_j$. The conditional probability of encountering a bug on the second traversal of the
same path with the same parameters is unity. Similarly, the probability on the same
path with the same parameters the third time is also unity. Thus, the probability of a
bug on three traversals of path $j$ is $q_j \times 1 \times 1 = q_j$.

Since we have assumed a variation in parameters on each run in our model, each
test is independent, and the probability of encountering one bug on three successive
traversals of path $j$ is given by the binomial distribution$^3$ as

$$P(\text{1 error in three trials}) = \binom{3}{1} q_j (1 - q_j)^2 \qquad (1)$$

Similarly, the expected number of occurrences in a probabilistic process governed by
a binomial distribution is

$$\text{Number of Occurrences} = N q \qquad (2)$$

where $N$ is the number of trials and $q$ the probability of occurrence.
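
A short numerical check of the two binomial results may be helpful. The Python below is only an illustration; the value 0.1 used for the per-run error probability is hypothetical and the function names are not from the report.

```python
from math import comb


def prob_k_errors(n_trials, k, q):
    """Binomial probability of exactly k errors in n_trials independent runs."""
    return comb(n_trials, k) * q ** k * (1 - q) ** (n_trials - k)


def expected_occurrences(n_trials, q):
    """Mean of the binomial distribution, N * q as in Eq. (2)."""
    return n_trials * q


if __name__ == "__main__":
    q = 0.1  # hypothetical per-run error probability
    print(prob_k_errors(3, 1, q))          # Eq. (1) with q_j = 0.1
    print(expected_occurrences(3, q))
```

Summing k times the probability of k errors over k = 0, ..., N reproduces N q, which is the relation applied path by path in the next section.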
B. Development of the Model

We can now compute the total number of failures $n_f$ in N tests. The tests are
distributed along each path such that $N f_1$ tests traverse path 1, $N f_2$ tests traverse path 2,
etc. Thus, successive application of Eq. (2) to each of the i paths yields for the
number of failures in N tests

    $n_f = N f_1 q_1 + N f_2 q_2 + \cdots + N f_i q_i$    (3)


We can now compute the system probability of failure on any one test run, $q_0$, by taking
the ratio $n_f/N$ as N approaches infinity

    $q_0 = \lim_{N \to \infty} \frac{n_f}{N}$    (4)


Similarly we can compute the system failure rate, $z_0$, by first computing the total
number of test hours. First we compute the total number of traversals of path i as $N f_i$,
as was previously done. Out of these traversals the fraction $p_i$ will be successful and
will accumulate $N f_i p_i t_i$ hours of successful operation. If we assume that the time
to failure distribution for the $N f_i q_i$ traversals which result in failure is rectangular,
then each trial which results in failure runs $t_i/2$ hours on the average before failure.
Thus, the total test time accumulated in N runs is given by

    $H = N f_1 p_1 t_1 + N f_1 q_1 \frac{t_1}{2} + N f_2 p_2 t_2 + N f_2 q_2 \frac{t_2}{2} + \cdots + N f_i p_i t_i + N f_i q_i \frac{t_i}{2} = N \sum_{j=1}^{i} f_j t_j \left(p_j + \frac{q_j}{2}\right)$    (5)


Substitution of $p_j = 1 - q_j$ in Eq. (5) and simplification yields

    $H = N \sum_{j=1}^{i} f_j t_j \left(1 - \frac{q_j}{2}\right)$    (6)


We  now  compute  the  system  failure  rate  as 


    $z_0 = \lim_{N \to \infty} \frac{n_f}{H}$    (7)


and substitution from Eqs. (3) and (6) into Eq. (7) yields in the limit

    $z_0 = \frac{\sum_{j=1}^{i} f_j q_j}{\sum_{j=1}^{i} f_j \left(1 - \frac{q_j}{2}\right) t_j}$    (8)
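As a rough numerical illustration of Eqs. (3) through (8), the sketch below evaluates the system probability of failure and the system failure rate for a made-up three-path program. The function name and the numerical values of f, q and t are assumptions chosen only for illustration; they are not data from this study. The expression used for the probability of failure per run is the ratio in Eq. (4) after substituting Eq. (3).

```python
def system_failure_metrics(f, q, t):
    """Return (q0, z0) for path frequencies f, per-path failure
    probabilities q and per-path run times t (hours).

    q0 is n_f / N from Eqs. (3)-(4); z0 is n_f / H from Eqs. (6)-(8).
    """
    if not (len(f) == len(q) == len(t)):
        raise ValueError("f, q and t must have one entry per path")
    # Failures per test: sum of f_j * q_j.
    q0 = sum(fj * qj for fj, qj in zip(f, q))
    # Test hours per test: sum of f_j * t_j * (1 - q_j / 2).
    hours_per_test = sum(fj * tj * (1 - qj / 2) for fj, qj, tj in zip(f, q, t))
    return q0, q0 / hours_per_test


# Hypothetical three-path program (illustrative numbers only).
f = [0.5, 0.3, 0.2]      # fraction of tests that traverse each path
q = [0.01, 0.02, 0.05]   # probability of failure on each traversal
t = [1.0, 2.0, 4.0]      # run time of each path in hours
q0, z0 = system_failure_metrics(f, q, t)
print(f"q0 = {q0:.4f} failures per run, z0 = {z0:.5f} failures per hour")
```

Because N cancels in the ratio of Eq. (3) to Eq. (6), the sketch never needs the number of tests, which mirrors the limit taken in Eq. (7).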
C. Special Cases

We now wish to examine Eqs. (4) and (8) under special constraints. These are
listed in Table I. Note that the units of $z_0$ are clearly seen from Case 4 to be failures
per hour, or just $\mathrm{hr}^{-1}$.

TABLE I. System probability of failure and
failure rate for special cases.
D. Measurement of Parameters

In order to implement the model developed in the previous sections we must
develop numerical values for the sets of parameters $f_i$, $q_i$, and $t_i$. Of course, in keeping
with the concept of structured programming and levels of structures within levels, one
could merely state that we continue decomposition to lower levels until we end up with
a new set of $f_j$, $q_j$, and $t_j$ parameters at a lower level. Clearly, the answer to the
question of how we could measure or estimate our parameters at a higher level is also, by and
large, an answer to how we would do the measurement at a lower level.

The parameter sets $f_i$ and $t_i$ are related to the structure, size and complexity of
the control structure and program modules. The determination of the $f_i$ can be made
by a study of the physical meaning of the paths and the distributions of input parameters
which drive one along the program paths. If the program is complex, or there is really
no information on input statistics, we can take one of two approaches: assume $f_i$ has
a uniform distribution (see Case 2 of Table I), or insert counters in the various paths
and experimentally determine the $f_i$. The experimental approach requires that the
program be in reasonably good shape so that a simulated test program can be run.
Clearly, if a counter $c_i$ is placed in each path such that it registers one count for each
path traversal, and we run N tests, then $f_i \approx c_i/N$.
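The counter approach can be sketched as follows. This is a minimal illustration, not the instrumentation used in this work: the toy program, its three paths and the input distribution are all hypothetical, and a real program would increment a counter at the entry to each path rather than return a path label.

```python
import random
from collections import Counter


def estimate_path_frequencies(run_test, n_tests, rng):
    """Run n_tests tests and return {path: c_i / N} from the path counters."""
    counts = Counter()
    for _ in range(n_tests):
        counts[run_test(rng)] += 1   # one count per path traversal
    return {path: c / n_tests for path, c in counts.items()}


def toy_program(rng):
    """Hypothetical program: the input value selects one of three paths."""
    x = rng.random()
    if x < 0.5:
        return 1
    if x < 0.8:
        return 2
    return 3


freqs = estimate_path_frequencies(toy_program, 10_000, random.Random(0))
for path in sorted(freqs):
    print(f"path {path}: f = {freqs[path]:.3f}")
```

The estimates are only as representative as the simulated inputs, which is why the text asks for a program in reasonably good shape and a realistic test driver.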

The set of $t_i$ parameters can also be either calculated or measured. If the
program is written in assembly, machine or microprogramming code, one can estimate
quite closely the run time of a sequence of code by summing the operating times of each
instruction. If the program is complex, one can write an analysis program to read the
code and perform the time analysis to determine $t_i$. In the case of a higher level
language (FORTRAN, PL/1, COBOL, etc.) the analysis is more complex, because each
statement may expand into anywhere from one to, say, ten machine language statements. Several
approaches are possible. First of all, one can obtain a core dump of the machine
language program and proceed as has been described. Another alternative is to insert a
block of higher level code inside a DO I = 1 to K loop. The loop is run for a particular
value of K and the C.P.U. time of the computer recorded. The value of K is changed
and another run is made and its C.P.U. time recorded. With about 3 values of C.P.U.
time vs. K, an accurate enough straight line or polynomial model of run time vs. K can
be fitted to the data.* One can then use the formula to predict the run time of the actual
code block by substituting the number of repetitions (the true value of K). Of course, if
the program and a simulation are available, one can merely run several test runs for
each path, record the times, and use average values for each path.

* It is necessary to take several measurements for two reasons. First of all, there
is program overhead which may vary from run to run depending on the operating
system. (Also there is DO loop overhead.) Second, the recording of C.P.U. time is not
accurate for short run times. To correct for DO loop overhead and also system
overhead, one can perform the measurement with and without the code block in the loop
and work with the difference.
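The loop-timing procedure described above might look like the following in a modern setting. This is only a sketch under stated assumptions: Python's `time.process_time` stands in for the recorded C.P.U. time, the code block being timed is an arbitrary placeholder, and a straight-line least-squares fit is used rather than a polynomial.

```python
import time


def cpu_time_of_loop(block, k):
    """CPU seconds used by k repetitions of block()."""
    start = time.process_time()
    for _ in range(k):
        block()
    return time.process_time() - start


def net_times(block, ks):
    """Time with the block minus time with an empty body, for each K.

    The subtraction removes loop overhead, as the footnote suggests.
    """
    def empty():
        return None
    return [cpu_time_of_loop(block, k) - cpu_time_of_loop(empty, k) for k in ks]


def fit_line(xs, ys):
    """Least-squares straight line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Placeholder code block and three values of K (illustrative only).
ks = [1000, 2000, 4000]
slope, intercept = fit_line(ks, net_times(lambda: sum(range(50)), ks))
print(f"estimated CPU time per repetition: {slope:.3e} s")
```

The slope of the fitted line is the estimated run time of one pass through the block; the intercept absorbs whatever fixed overhead the subtraction did not remove.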

The estimation of the $q_i$ parameters is somewhat more difficult. During the early
stages of design or development one can try to estimate them using historical data. One
way to derive the $q_i$ parameter is to obtain failure rate data on the program, equate
it to $z_0$ using the assumptions of Cases 3 or 4 in Table I, and solve for $q_i$. This process
can be repeated as the program is written and better values for $q_i$ determined. There
is a possibility that one could calculate $q_i$ from a more basic procedure. Knuth has
shown that most FORTRAN statements are relatively simple and fall into one of several
classes. If each of these classes also has a characteristic error rate, then by analysis
of the $q_i$ values for several examples, we should be able to derive characteristic values
for the $q_i$ parameters.

E.  Conclusions 

The model developed above allows one to decompose a program into a number
of modules, paths, modes, or other functional entities. One can then compute an
expression for the software failure rate in terms of probabilistic and deterministic
parameters which can be estimated from historical data or determined by analysis or
experiment. The model provides a clear-cut procedure for relating the reliability of a large
software system to the reliability of its constituent parts. The model is presently being
applied to a number of modest-size problems in order to obtain typical parameter values
and validate the model.

Rome  Air  Development  Center 

F30602-74-C-0294  M.L.  Shooman 
NONLINEAR CONTROL OF LINEAR MULTIVARIABLE SYSTEMS
L.  Shaw 

It is well known that there are advantages in using nonlinear controllers for linear
systems. Such controllers can achieve fast response to large errors, while responding
slowly to small deviations which are often largely due to sensor noise contributions.
While there have been some applications of this approach to single-input single-output
systems, little is available in the way of general multivariable theory.

We have begun to develop an approach to this problem, starting from the theory
of "linear system optimization with a prescribed degree of stability" (Ref. 1), which constructs
the optimal controller u(x, t) for the system

    $\dot{x} = Ax + Bu$    (1)

and the criterion

    $J = \int_0^\infty (x'Qx + u'Ru)\, e^{2\alpha t}\, dt$ .    (2)

Here, A, B, Q, R are constant matrices, (A, B) controllable, $R > 0$, $Q \ge 0$. Large values
of the parameter $\alpha > 0$ force faster settling of an initial $x(0) \ne 0$. It was shown in Ref. 2
that if $\alpha$ is sufficiently big, such that all eigenvalues of $(A + \alpha I)$ satisfy
$\mathrm{Re}\{\lambda_i(A + \alpha I)\} > 0$, then a solution exists in the limiting case $Q \to 0$. This case is
interesting for two reasons. First, since $\alpha$ is a design parameter which provides for
adjustment of settling time, one could argue that Q is unnecessary. (The restriction on
the range of $\alpha$ values prevents a trivial solution of u = 0.) Second, the limiting control
law is in the linear feedback form

    $u(t) = -R^{-1}B'P^{-1}(\alpha)\,x(t)$    (3)

where P is the unique positive definite solution of the Lyapunov equation

    $P(A + \alpha I)' + (A + \alpha I)P = BR^{-1}B'$ .    (4)

This is in contrast to the necessity for solving a Riccati equation when $Q \ne 0$.

A nonlinear controller can be formed from the linear one above by making $\alpha$ a
function $\alpha(x)$, such that $\alpha$ is bigger for "bigger" x. To this end, we measure the size
of x in terms of a Lyapunov function $V = x'P^{-1}(\alpha)x$. (It is straightforward to show that
this is a Lyapunov function for the system in Eq. (1) using the controller in Eq. (3) for
a fixed value of $\alpha$.) We make $\alpha(x)$ an "increasing" function of x by the implicit definition

    $V(x, \alpha) = F(\alpha)$    (5)

where $F(\alpha)$ is some increasing function of $\alpha$. $V(x, \alpha)$ is also an increasing function of $\alpha$, since
differentiation of Eq. (4) with respect to $\alpha$ shows $dP/d\alpha$ to satisfy

    $\frac{dP}{d\alpha}(A + \alpha I)' + (A + \alpha I)\frac{dP}{d\alpha} = -2P$ ,    (6)

and therefore to be negative definite, so that

    $\frac{\partial V}{\partial \alpha} = x'\,\frac{dP^{-1}}{d\alpha}\,x = -x'P^{-1}\frac{dP}{d\alpha}P^{-1}x > 0$ .    (7)

It remains to show that a function $F(\alpha)$ can be found such that the constraint, Eq.
(5), yields a unique $\alpha$ for each x, and that the system using this $\alpha(x)$ in Eq. (3) is
asymptotically stable. A class of suitable functions has been found to work in several
examples, but the existence of such stable nonlinear controllers has not yet been
demonstrated for all systems of the form of Equation (1).

It is evident that

    $\dot{V} = -x'P^{-1}[2\alpha P + BR^{-1}B']P^{-1}x + \frac{\partial V}{\partial \alpha}\frac{d\alpha}{dt}$ .    (8)

We assume that A has at least one stable mode so that the condition on eigenvalues of
$(A + \alpha I)$ assures that $\alpha > 0$ and that the term in brackets in Eq. (8) is positive definite.
For the other term, we note

    $\frac{d\alpha}{dt} = \frac{d\alpha}{dF}\frac{dF}{dt} = \frac{d\alpha}{dF}\frac{dV}{dt}$    (9)

where the constraint in Eq. (5) has been invoked and the inverse function $\alpha(F)$ is
well-defined due to the monotonicity of $F(\alpha)$. Thus Eq. (8) becomes

    $\dot{V}\left[1 - \frac{\partial V}{\partial \alpha}\frac{d\alpha}{dF}\right] = -x'P^{-1}[2\alpha P + BR^{-1}B']P^{-1}x$ .    (10)

Using Eq. (7) it is clear that $\dot{V} < 0$ for all $x \ne 0$, and $F(\alpha)$ is monotone increasing, if and
only if

    $\frac{dF}{d\alpha} > \frac{\partial V}{\partial \alpha}$    (11)

when $\alpha(x)$ is the solution of Eq. (5) for any x. It is noteworthy that this condition also
assures at most one solution to Equation (5).

A natural choice for $F(\alpha)$ is

    $F(\alpha) = [h'P^{-1}(\alpha)h]^k$    (12)

for a fixed vector h and a constant k. We choose h such that $h'P^{-1}(\alpha_{\min})h = 0$, where
$(-\alpha_{\min})$ is the smallest (most negative) eigenvalue of A, assumed temporarily to be
real for simplicity. The existence of such an h follows from multiplication of Eq. (6)
on the left and right, respectively, by $h'P^{-1}$ and its transpose to get

h'  — (P'^AP+a  . I)'  h + h' (P”  ^AP + a • I)  ^-T — h=-2h'P'^h  (13) 

Since $(P^{-1}AP + \alpha_{\min} I)$ has an eigenvalue of zero, we choose h to be the corresponding
eigenvector and find $h'P^{-1}(\alpha_{\min})h = 0$. (If the most negative eigenvalue of A is complex
valued we use a complex $\alpha$ in Eq. (13), with P a function of $\mathrm{Re}(\alpha)$, transposes replaced
by conjugate-transposes, and h a complex valued vector.) This choice of h leads to F
and V functions of the form shown in Figure 1. Since the slope of F can be increased
by increasing k, we can hope that a sufficiently large k will insure the existence of a
solution to Eq. (5) as well as satisfaction of the stability condition, Equation (11).

Fig. 1. Shapes of functions in Equation (5).

When the F in Eq. (12) is used, the stability condition, Eq. (11), becomes

    $x'\left[k\,\frac{h'\frac{dP^{-1}}{d\alpha}h}{h'P^{-1}h}\,P^{-1} - \frac{dP^{-1}}{d\alpha}\right]x > 0$    (14)

for all x. Positive definiteness of the matrix in brackets is equivalent to that of

    $\frac{dP}{d\alpha} + P\,\frac{k\,h'\frac{dP^{-1}}{d\alpha}h}{h'P^{-1}h}$ ,  that is,  $\left[-\Lambda + kI\,\frac{h'\frac{dP^{-1}}{d\alpha}h}{h'P^{-1}h}\right] > 0$    (15)

where $\Lambda$ is a diagonal matrix whose entries are the roots $\lambda$ of the determinantal equation

    $\det\left[\frac{dP}{d\alpha} + \lambda P\right] = 0$ .    (16)

Given a system defined by A, B, R, it is necessary to find a k such that the scalar
multiplying I in Eq. (15) is greater than $\lambda_{\max}$ from Eq. (16) for all $\alpha$. Some examples
illustrate the utility of this approach.

(1) $\dot{x} = -ax + u$; $a > 0$; $R = 1$

Here any h can be used. With h = 1, the system is stable for k > 1; and when
k = 2, $u = -(x^2)x$.
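The first example can be checked numerically with a short sketch. It assumes B = 1 (read off from the scalar system above) and h = 1, so that Eq. (4) gives $P = 1/[2(\alpha - a)]$ and the constraint of Eq. (5) can be solved for $\alpha(x)$ in closed form. The function names, the Euler integration step and the initial condition are illustrative assumptions, not part of the original study.

```python
def alpha_of_x(x, a, k):
    """Solve V(x, alpha) = F(alpha) for the scalar example with h = 1.

    With P = 1 / (2 (alpha - a)):  2 (alpha - a) x^2 = [2 (alpha - a)]^k,
    so 2 (alpha - a) = (x^2)^(1 / (k - 1)) for k > 1.
    """
    if k <= 1:
        raise ValueError("the example is stable only for k > 1")
    return a + 0.5 * (x * x) ** (1.0 / (k - 1))


def control(x, a, k=2.0):
    """Nonlinear feedback u = -R^-1 B' P^-1(alpha(x)) x with B = R = 1."""
    p_inverse = 2.0 * (alpha_of_x(x, a, k) - a)
    return -p_inverse * x


def simulate(x0, a, k=2.0, dt=1e-3, steps=5000):
    """Forward-Euler integration of x' = -a x + u(x); returns final state."""
    x = x0
    for _ in range(steps):
        x += dt * (-a * x + control(x, a, k))
    return x


print(control(2.0, a=1.0))   # k = 2 gives u = -x^3, i.e. -8.0 at x = 2
print(simulate(2.0, a=1.0))  # state after 5 time units
```

For k = 2 the feedback reduces to the cubic law quoted in the example, so large errors are driven down quickly while the loop gain near the origin stays small.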


(2)  A = 


Here we find $h' = c(1, -4)$ and the system is stable if $k > (2 + \sqrt{2})/3 = 1.138$.

With k = 2 and c = 1

    $u = -(4\alpha - 10)(\alpha x_1 + x_2)$

where $\alpha(x)$ is the unique real solution (> 4) of

(4a  - 10)(2a^-  13a  + 20)^  = Xj(2a^-  5a  + 4)  + ZaXjX^  + 

Figure 2 shows the response of this system (uncontrolled, with linear
control, and with nonlinear control) for two different initial conditions.

Although  the  generality  of  this  method  remains  to  be  demonstrated,  it  appears  to 
be  a first  step  toward  the  design  of  nonlinear  controllers  for  multivariable  systems. 

If  a suitable  k exists  for  a given  system,  then  there  will  be  a minimum  suitable  k.  (See 
Equation  (15). ) Use  of  that  minimum  k will  yield  the  "most  nonlinear"  system  because 
larger  k will  make  F(a)  increase  very  steeply  when  F(a)  > 1,  and  will  make  all  large 
values  of  x produce  nearly  the  same  a . Conversely,  a large  k will  effectively  lead  to  a 
smooth  transition  between  two  linear  systems  (a  = “rnin  other  linear 

system  for  small  x). 

Implementation of the controllers described here requires solution of a polynomial
equation at each sampling time. While the degree of this equation will generally
increase as the system dimension increases, such calculations should be possible with
present-day digital controllers. Indeed, feasibility of such a controller is one of the
main advantages of digital versus analog controllers.

Use of a decreasing $F(\alpha)$ in Eq. (5) will clearly make $\dot{V} < 0$ in Equation (10). This
corresponds to a kind of nonlinear system which puts a hard constraint on the control
amplitude.

Arguments parallel to the ones given here yield similar results for receding
horizon control, which always attempts to force x(t + T) = 0. Nonlinear controls are
produced there by making the horizon distance T a function T(x) of the current state.
Fig. 2. Linear vs. nonlinear control, Example No. 2.

REFERENCES 

1. B.D.O. Anderson and J.B. Moore, "Linear System Optimization with a Prescribed
Degree of Stability," Proc. IEE, Vol. 116, No. 12, pp. 2083-2087 (December 1969).

2. D. Sarlat and Y. Thomas, "Optimal Regulator with a Prescribed Degree of Stability:
A Limiting Case," Electron. Lett., Vol. 11, No. 17, pp. 411-412 (August 1975).

3. Y. Thomas, D. Sarlat and L. Shaw, "A Receding Horizon Approach to the Synthesis
of Nonlinear Multivariable Regulators," Electron. Lett. (1977).




OPTIMAL  REPLACEMENT  WHEN  OBSERVABLE  STAGES  OF  DETERIORATION 
HAVE  MULTIVARIATE  EXPONENTIALLY  DISTRIBUTED  DURATIONS 

L.  Shaw,  C-L.  Hsu  and  S.  G.  Tyan 

This project continued our work on the use of multivariate exponential
distributions in reliability applications. Details appear in a thesis and a paper which will
appear soon. We give here only a brief formulation of the problem and some of the
general properties of the results.

We exploit a frequently used model which describes component deterioration as
the passage through a sequence of random-duration intervals, with the quality in
each interval constant, but at a level less than that in the previous interval. The
quality progresses through N+1 levels from k = 0 for a new part to k = N for a worthless
one. We have generalized the work of Luss (Ref. 4), where he used independent exponential
durations in each stage, to allow a special class of correlated exponential variables to
describe the durations.


We assume that transitions from deterioration state k = i to k = i+1 are
immediately observable; replacement (setting k = 0) is mandatory when k = N; rewards are
decreasing functions of time; and the desired replacement rule must maximize the
average reward per unit time. Three kinds of reward structures were examined for
a system in state k = i:

(1) Linear: $c_i$ dollars per unit time, with $c_0 > c_1 > \cdots > c_N$.

(2) Quadratic: $c_i t$ dollars per unit time after t seconds in state i, with $c_0 > c_1 > \cdots > c_N$.

(3) Linear after set-up delay: A readjustment interval of $t_i$ time units follows
the arrival in state k = i. The amount received is zero during that interval,
but accrues at rate $c_i$ afterwards.

The optimal rule for the linear case makes no use of the durations $r_i$ in states
k = i. It simply requires replacement when entering a state k whose value depends
on the $c_i$ and on the mean state durations.
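To see how a rule of this kind depends only on the reward rates and the mean durations, the sketch below compares "replace on entering state k" policies by the long-run average reward per unit time (cycle reward divided by cycle length). It is not the analysis of the thesis: the replacement downtime, the reward rates and the mean durations are hypothetical numbers introduced only so that the comparison is not trivial.

```python
def average_reward(k, c, mean_dur, downtime):
    """Average reward per unit time if the part is replaced on entering state k.

    One cycle passes through states 0 .. k-1 and then spends `downtime`
    (a hypothetical, reward-free replacement interval) before restarting.
    """
    reward = sum(c[i] * mean_dur[i] for i in range(k))
    length = sum(mean_dur[i] for i in range(k)) + downtime
    return reward / length


def best_replacement_state(c, mean_dur, downtime):
    """Return (k, rate) maximizing the average reward; k = N is the forced case."""
    n = len(c)
    rates = {k: average_reward(k, c, mean_dur, downtime) for k in range(1, n + 1)}
    k_best = max(rates, key=rates.get)
    return k_best, rates[k_best]


# Illustrative values only: reward rates and mean durations for states 0..3.
c = [10.0, 8.0, 2.0, 0.5]
mean_dur = [2.0, 3.0, 2.0, 1.0]
best_k, best_rate = best_replacement_state(c, mean_dur, downtime=3.0)
print(best_k, best_rate)
```

With these numbers the part is worth keeping through state 1 but not through state 2, so replacement occurs on entering state 2.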

In the other two cases, dynamic programming arguments show that the
correlation between stage durations makes knowledge of these durations useful information
for the optimal controller. There is a set of optimal decision thresholds $r^*_{k-1}$ such
that replacement is made on entering state k if $r_{k-1} < r^*_{k-1}$. A small $r_{k-1}$ indicates
that the current rate of deterioration is great. Furthermore, the optimal thresholds
are ordered: $r^*_{k-1} \le r^*_k$.

These results followed from a study of the properties of the conditional densities
of the $r_k$ sequence. In particular, that sequence preserves the Markov property of
the Gaussian sequences which generate the correlated exponentially distributed
variables, and the conditional variables are stochastically increasing in the sense that the
conditional distribution function $F(r_k / r_{k-1})$ is an increasing function of $r_{k-1}$ for every
$r_k$. Moreover, that conditional density, expressible in terms of Bessel functions, is
totally positive for all orders ($TP_\infty$) and it defines conditional expectations which are
convexity preserving.
ESTIMATION AND MODELLING OF NON-STATIONARY TIME SERIES
F.  Kozin 

A.  Introduction 

The object of this report is to present some ideas and recent results that we have
obtained, motivated by the study of non-stationary time series such as those generated by
strong motion earthquake records and other geophysical phenomena.

We have approached this problem statistically. That is, given the data, fit a
model. As yet, we have not based our studies upon the continuum mechanics of layered
media, or other possible physical models or mechanisms. Naturally, for any
modelling approach to be viable, it would have to be tested relative to known physical
mechanisms.

Typical records of data that we have analyzed are shown in Figures 1 and 2. This
data is taken from reports and tapes available from the Earthquake Engineering
Research Laboratories of Cal Tech.


Figs. 1 and 2. Actual data.


We have been concerned with acceleration data. However, it is clear that velocity
or displacement data could be treated in the same fashion. Strong motion
earthquake records are typically non-stationary in both amplitude and frequency
content. These characteristics are easily seen in Figures 1 and 2. Thus, traditional
stationary statistical tools such as covariance and power spectral density analysis do
not apply.

Purely on the basis of modelling of non-stationary time series, we divide the
problem into three parts:

(1) Postulate a class of models for the data.

(2) Estimate the unknown parameters of the model from the data.

(3) Assess the quality or fit of the model to the given data.

In this report, we shall treat each part separately, indicating some of our pitfalls
as well as some of our positive results. Our initial studies of this problem can be
found in References 1 and 2.

B. A Class of Models for Non-Stationary Time Series

The model that we postulate for the data is

    $y(k) + a_1(k-1)\,y(k-1) + \cdots + a_\ell(k-\ell)\,y(k-\ell) = g(k)\,u(k)$ ,    (1)

where y(k) represents the time series at "time" k, and {u(k)} is a white Gaussian
sequence. The variance of u(k) will depend upon the time interval between data points.
The coefficients $a_i(k)$, $i = 1, \ldots, \ell$, as well as the coefficient g(k) are time varying and are
to be obtained through the data.

Clearly the $a_i(k)$ cannot be completely arbitrary. Therefore, we parametrize the
coefficients by assuming that they are of the form

    $a_i(k) = \sum_{j=1}^{m} a_{ij}\, f_j(k)$ ,  $i = 1, \ldots, \ell$ ,    (2)

where m is prechosen, and the $f_j(k)$, $j = 1, \ldots, m$, are a suitable family of known functions.
In order to give the $a_i(k)$ the freedom to fit a wide class of functions, we chose
the $f_j(k)$ to be the orthogonal family of discrete Legendre polynomials, defined as

    $f_j(k) = \sum_{s=0}^{j} (-1)^s \binom{j}{s}\binom{j+s}{s} \frac{k^{(s)}}{N^{(s)}}$ ,    (3)

j = 0, 1, 2, ..., N, where $k^{(s)} = k(k-1)\cdots(k-s+1)$, similarly for $N^{(s)}$, and N represents
the number of data points. This family of polynomials is orthogonal on the interval
[0, N].
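The orthogonality claim can be verified directly for small N. The sketch below evaluates the standard discrete Legendre polynomials with falling factorials and exact rational arithmetic, then forms the sum of products over k = 0, ..., N. The function names are illustrative, and the binomial form used here is the textbook definition, which is assumed to be the one intended by the defining equation above.

```python
from fractions import Fraction
from math import comb


def falling(x, s):
    """Falling factorial x^(s) = x (x - 1) ... (x - s + 1), with x^(0) = 1."""
    out = 1
    for r in range(s):
        out *= x - r
    return out


def discrete_legendre(j, k, n):
    """Degree-j discrete Legendre polynomial evaluated at integer k on 0..n."""
    return sum(
        Fraction((-1) ** s * comb(j, s) * comb(j + s, s) * falling(k, s),
                 falling(n, s))
        for s in range(j + 1)
    )


def inner_product(i, j, n):
    """Sum over k = 0..n of f_i(k) f_j(k)."""
    return sum(discrete_legendre(i, k, n) * discrete_legendre(j, k, n)
               for k in range(n + 1))


N = 5
for i in range(4):
    print([str(inner_product(i, j, N)) for j in range(4)])
```

The printed matrix is diagonal: off-diagonal sums vanish exactly, while the diagonal entries are the squared norms that would be needed to normalize the family.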

The function {g(k)} plays the role of an envelope or amplitude modulation on the
random noise input. It is also obtained from the data as described in the next section.
In statistical terminology Eq. (1) represents a non-stationary autoregression model.
C. Parameter Estimation

We estimate the parameters $a_{ij}$ by maximum likelihood techniques. Since no
statistical theorems were available to determine the properties of maximum likelihood
estimators for non-stationary autoregressive models, we proved a consistency result.
In its multi-variable form, the theorem is

Theorem - Consistency

Let

    $z(k) = A(k-1)\,z(k-1) + v(k)$ ,    (4)

where z, v are p-vectors, and

    $A(k) = A_0 + \sum_{i=1}^{m} a_i^0 A_i(k)$ ,

where $A_0, A_1(k), \ldots, A_m(k)$ are given bounded p×p matrices. Finally, v(k) is a vector
whose components are independent Gaussian white sequences, and whose non-singular
covariance matrix is

    $E\{v(k)v^T(k)\} = \mathrm{diag}(g_{ii}(k)) = G(k)$ .

The constants $a_i^0$ represent the true, unknown, parameters that are to be estimated
from the data.

The likelihood function for N data points is

    $L_N(a_1, \ldots, a_m) = \sum_{k=1}^{N} \left\| z(k) - \left[A_0 + \sum_{i=1}^{m} a_i A_i(k-1)\right] z(k-1) \right\|^2_{G^{-1}(k)}$ .

If $\hat{a}(N)$ is the m-vector that minimizes $L_N(a_1, \ldots, a_m)$ for the N observations
z(1), ..., z(N), and if the matrix Q(N) satisfies for large N

    $Q(N) > Q$ (positive definite)    (5)

where

    $Q_{ij} = \frac{1}{N}\sum_{k=1}^{N} [A_i(k-1)\,z(k-1)]^T\, G^{-1}(k)\, [A_j(k-1)\,z(k-1)]$ ,    (6)

then

    $\lim_{N \to \infty} \hat{a}(N) = a^0$ ,

where $a^0$ is the true parameter vector, with probability one.
Having this result, we know that if one has sufficient data, the maximum
likelihood estimates will converge to a set of limit values. This is a very important point
that cannot be overlooked. When we first estimated these parameters in Refs. 1 and 2,
we applied recursive estimation techniques, in particular non-linear filtering, via the
so-called Schwartz-Bass approximation.

This approximation expands the non-linear terms about the (unknown) optimum
estimate. Keeping terms up to and including second powers, a closed set of
approximate filter equations is obtained which can be solved from the data to obtain
approximate estimates. We found with our non-stationary model that sometimes the
estimates would converge and we obtained good results, while at other times the estimators would
diverge and yield an overflow in our computer outputs. When we estimated the
coefficients from data simulated by known non-stationary models, we found that if our initial
guess was close to the known, true values, the Schwartz-Bass filter would yield
estimates close to the true values. On the other hand, if the initial errors were relatively
large, the computations would become unstable. We believe that this was due to the
fact that since the recursive technique updates the estimate only one or two points ahead
in time, it could not see that after an initial rise, say, in the amplitude characteristic
of the data, the amplitude then decreases. Hence, the filter would read this as requiring an
unstable system to fit the rise, which would then cause an overflow. We also mention in
passing that the so-called extended Kalman filter proved (in our experience) to be a
poor estimator for coefficients even in the case of stationary models.

Since maximum likelihood estimates make use of the entire data, and since we
know that these estimates are consistent by the theorem above, we are always
assured of stable computational results.
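Because the likelihood function is quadratic in the unknown coefficients, minimizing it over the whole record reduces to weighted least squares. The sketch below does this for the simplest scalar case of Eq. (1), with order one and a coefficient expanded on a small set of basis functions. It is an illustrative assumption-laden example, not the program used in this study: the two basis functions, the true coefficient values and the noise-free test record are all made up.

```python
def solve_linear(m, b):
    """Solve m x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] + [bi] for row, bi in zip(m, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[piv] = a[piv], a[c]
        for r in range(c + 1, n):
            fac = a[r][c] / a[c][c]
            for j in range(c, n + 1):
                a[r][j] -= fac * a[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][j] * x[j] for j in range(r + 1, n))) / a[r][r]
    return x


def fit_ar1(y, g, basis):
    """Minimize sum_k [y(k) + sum_j a_j f_j(k-1) y(k-1)]^2 / g(k)^2 over a."""
    m = len(basis)
    normal = [[0.0] * m for _ in range(m)]
    rhs = [0.0] * m
    for k in range(1, len(y)):
        w = 1.0 / g[k] ** 2
        phi = [f(k - 1) * y[k - 1] for f in basis]   # regressors at time k
        for i in range(m):
            rhs[i] -= w * phi[i] * y[k]
            for j in range(m):
                normal[i][j] += w * phi[i] * phi[j]
    return solve_linear(normal, rhs)


# Hypothetical noise-free record generated with a(k) = -0.9 - 0.1 k / N.
N = 30
basis = [lambda k: 1.0, lambda k: k / N]
y = [1.0]
for k in range(1, N + 1):
    y.append(-(-0.9 - 0.1 * (k - 1) / N) * y[k - 1])
estimate = fit_ar1(y, [1.0] * (N + 1), basis)
print(estimate)   # close to [-0.9, -0.1]
```

All N points enter the normal equations at once, which is the property the text contrasts with the one-step-ahead updates of the recursive filters.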

In our autoregressive model Eq. (1) the input u(k) is multiplied by the function
g(k), which basically plays the role of the amplitude envelope of the original state
variable y(k). To a certain extent the function g(k) separates the non-stationary amplitude
characteristics of the data from the non-stationary frequency characteristics of the
data. In order to see this, we need only recognize that the envelope g(k) is slowly
varying relative to y(k). Thus, upon dividing the equality Eq. (1) by g(k) (note: g(k) > 0),
the non-stationary amplitude characteristic is removed from the data. In our initial
studies, we attempted to fit this envelope by linear combinations of two or three
exponentials, but found that fitting the initial part of the envelope well would lead to large
errors in the later stages, and vice versa.

Hence, we were led to use another approach. We find that fitting the envelope of
the data by cubic spline techniques leads to excellent results. That is, cubic
polynomials are fit to various sections of the data, so that the points at which two polynomials
meet have equal derivatives. Thus, the envelopes become smooth curves. A typical
envelope is shown in Figure 3.


Fig. 3.

Our procedure therefore is as follows. Given a set of data, which typically from
the Cal Tech reports is sampled 0.02 seconds apart, our program first calculates the envelope
function g(k) directly. Having the envelope function, and for given values of $\ell$ and m
in Eq. (2), maximum likelihood estimates for the unknown coefficients $a_{ij}$ are
determined for the given g(k).

In this way a model of the form Eq. (1) is fitted to the given non-stationary time
series.


D.  Quality  of  the  Model 

The natural question for any model fitting procedure is, "Is the model good?"


We can certainly look qualitatively at the model by generating records from Eq.
(1) with the calculated g(k) and the estimated $a_{ij}$. This simulation simply requires
that white Gaussian sequences be generated with variance proportional to 0.02, and
used as an input to generate the y(k)'s in Equation (1).
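Generating a record from the fitted model amounts to running the recursion of Eq. (1) forward with a white Gaussian input. The sketch below shows one way to do that; the second-order coefficients, the envelope shape, the zero initial conditions and the noise scale are hypothetical choices for illustration and are not the fitted values behind the figures.

```python
import math
import random


def simulate_ar(coeffs, g, n, rng, sigma=1.0):
    """Generate n samples of y(k) = -sum_i a_i(k-i) y(k-i) + g(k) u(k).

    coeffs is a list of functions a_1, ..., a_ell of the time index, g is
    the envelope function, and u(k) is white Gaussian with std sigma.
    Zero initial conditions are assumed.
    """
    ell = len(coeffs)
    y = [0.0] * ell                       # y(0), ..., y(ell - 1)
    for k in range(ell, n + ell):
        value = g(k) * rng.gauss(0.0, sigma)
        for i, a in enumerate(coeffs, start=1):
            value -= a(k - i) * y[k - i]
        y.append(value)
    return y[ell:]


# Hypothetical second-order model with a slowly drifting coefficient and
# an envelope that rises and then decays.
n = 500
coeffs = [lambda k: -1.2, lambda k: 0.5 + 0.2 * k / n]
envelope = lambda k: (k / n) * math.exp(-3.0 * k / n)
record = simulate_ar(coeffs, envelope, n, random.Random(1), sigma=math.sqrt(0.02))
print(len(record), max(abs(v) for v in record))
```

Comparing several such records, each from a different random seed, against the measured record gives the qualitative check the text describes.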
Typical simulated records corresponding to Figs. 1 and 2 are shown in Figs. 4
and 5 respectively. Qualitatively, they look reasonably good. But the more important
question is how shall we judge this model relative to other possible models?


Fig. 4. Output of simulator.

Fig. 5. Output of simulator.

In our attempts to apply non-linear filtering methods in Refs. 1 and 2, we studied
the quality of the model by studying the so-called residuals. The residual is the
numerical difference between the estimated data value (from the model) and the actual
data value. Theory tells us that the residuals should become a white noise sequence.
Hence, we applied statistical tests to ascertain the whiteness of the residuals.
Parameter estimates were accepted if the residuals passed several "whiteness" tests at the
95% confidence level.

This approach appeared to be adequate in lieu of convergence theorems, but it is
difficult to judge the relative quality of a given model in this way.

Fortunately, there is a recent approach to fitting autoregressive models due to
Akaike that appears to possess the potential for being a very useful tool.

Starting with the Kullback-Leibler mean information

    $E\left\{\log \frac{p(X \mid \theta_0)}{p(X \mid \theta)}\right\}$

for discriminating between two conditional densities $p(X \mid \theta_0)$ and $p(X \mid \theta)$, Akaike derived
the information criterion (AIC).

Akaike Information Criterion

Choose the model such that

    $-2 \log p(y^N \mid \hat{\theta}_K(N)) + 2K$    (7)

is minimized, where $y^N$ is the data, $\hat{\theta}_K(N)$ represents the maximum likelihood
estimates of the unknown parameters for the N data values, and K represents the number
of independent parameters.

He developed the AIC for stationary time series and has applied the criterion to a number of statistical problems, including estimating the orders of autoregressive moving-average models, with excellent success.

In order to derive the criterion, Akaike requires two conditions:

(i) $\hat\theta_{ML}(N)$ is a consistent estimator,

(ii) $\sqrt{N}\,\hat\theta_{ML}(N)$ is asymptotically normal.        (8)

For our non-stationary model, Eq. (1), we know from the theorem of Section C that Eq. (8)(i) holds. Fortunately, we are able to establish Eq. (8)(ii) for our model also, via a martingale central limit theorem.

Thus, the AIC holds for our non-stationary models as well.

We have applied the AIC to study the best choice of $\ell$, the order of the model, as well as the choice of the non-stationary coefficients.

To study the order of the model, the procedure is as follows. We first calculate the envelope function g(k). This remains fixed throughout all succeeding computations.

For each assumed value of $\ell$, determine the maximum likelihood estimates for the $a_i$'s. We note that for a given number, m, of orthogonal polynomials used in the coefficients, there will be a maximum of $\ell \cdot m$ coefficients to estimate. Thus $\ell m$ plays the role of K in Equation (7).

For each $\ell$, and each set of computed estimates $\hat a$ for that order $\ell$, the expression Eq. (7) is calculated. That integer $\ell$ for which Eq. (7) takes its minimum value is chosen as the best order fit to the data.
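The order-selection loop can be sketched for the simpler stationary case with constant coefficients, which is not the non-stationary model of the report. The sketch assumes Gaussian innovations, so that the least-squares fit is the conditional maximum likelihood estimate and Eq. (7) reduces, up to an additive constant, to N log(residual variance) + 2K with K equal to the order. The function names are hypothetical.

```python
import numpy as np

def ar_aic(y, order, max_order):
    """AIC (up to a constant) of a constant-coefficient AR(order) fit.

    All orders are fitted on the same rows y[max_order:], so the models
    are nested and their AIC values are directly comparable.
    """
    y = np.asarray(y, dtype=float)
    target = y[max_order:]
    # Columns hold the lagged values y(k-1), ..., y(k-order).
    X = np.column_stack([y[max_order - i: len(y) - i]
                         for i in range(1, order + 1)])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    sigma2 = np.mean(resid ** 2)
    aic = len(target) * np.log(sigma2) + 2 * order
    return aic, sigma2

def select_order(y, max_order):
    """Return the order with minimum AIC and the table of AIC values."""
    scores = {l: ar_aic(y, l, max_order)[0] for l in range(1, max_order + 1)}
    return min(scores, key=scores.get), scores
```

For the time-varying coefficients described above, K would be the number of polynomial coefficients actually estimated rather than the order alone.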

We studied the best order fit of autoregressive models of the form

$$ y(k) + a_1 y(k-1) + \cdots + a_{\ell-1}\, y(k-\ell+1) + a_\ell(k-\ell)\, y(k-\ell) = g(k)u(k), \qquad (9) $$

where $a_1, \ldots, a_{\ell-1}$ are unknown constants to be estimated and $a_\ell(k)$ is a time-varying term of the form Eq. (2) for m = 5.

The results are:

IIB028-N88W

    Order      AIC
      2        3.4210 x 10^
      3        2.8856 x 10^
      4        2.9244 x 10^

IIB031-S69

    Order      AIC
      2        2.2142 x 10^
      3        2.2702 x 10^
      4        2.2460 x 10^

Many more computer experiments are being performed via the AIC, and a number of interesting points are emerging. For example, records with a short strong motion period seem to be fit best by second order models. If the strong motion period is longer, third or fourth order models appear best. If the unknown coefficients are all assumed to be constant (stationary case), the best fit may be tenth order or higher.

Further studies are being performed to ascertain the significance of this approach to studying non-stationary time series.


National Science Foundation

RESTORATION OF IMPULSE TRAIN SIGNALS FROM SMOOTHED DATA
A. Papoulis and C. Chamzas

A central problem in signal analysis is the determination of a signal f(t) in terms of a band of its spectrum

$$ F_\sigma(\omega) = F(\omega)\,P_\sigma(\omega) \qquad (1) $$

where

$$ F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-j\omega t}\,dt \qquad (2) $$

and

$$ P_\sigma(\omega) = \begin{cases} 1 & |\omega| < \sigma \\ 0 & |\omega| > \sigma . \end{cases} \qquad (3) $$

Since only a smoothed version is known, f(t) cannot be determined exactly. If $f_\sigma(t)$ is used to estimate f(t), then an error $f(t) - f_\sigma(t)$ results. Generally two methods are used for the reduction of this error: windows and extrapolation.

Windows: The known band of the signal is multiplied by a function W(ω), and the inverse Fourier transform $f_w(t)$ of the resulting product

$$ F_w(\omega) = F_\sigma(\omega)\,W(\omega) \qquad (4) $$

is used as the estimate of f(t). The function W(ω) is chosen so as to reduce, in some sense, the resulting error $f(t) - f_w(t)$. The main advantage of this method is its computational simplicity. However, it replaces the unknown part of F(ω) with zero. The
method  of  windows  has  been  extensively  treated.^ 

Extrapolation: The known band $F_\sigma(\omega)$ is extrapolated beyond $|\omega| > \sigma$, and the inverse $f_e(t)$ of the spectrum $F_e(\omega)$ so formed is used as the estimator for f(t). The nature of the extrapolation depends on various a priori assumptions, and it is in general arbitrary. It is, however, no more arbitrary than the assumption that F(ω) = 0 for $|\omega| > \sigma$ implicit in the method of windows. A special form of extrapolation technique is the method of maximum entropy. It is based on the assumption that the entropy of f(t) is maximum subject to the constraint that its Fourier transform F(ω) is given for $|\omega| < \sigma$ (see Reference 2). In Ref. 3 a new method of extrapolation is presented, based on the assumption that f(t) is time-limited, i.e.,

$$ f(t) = 0 \quad \text{for } |t| > T . $$

In the next section we present an iteration scheme, based on Ref. 3, where the extrapolation of $F_\sigma(\omega)$ depends on the assumption that f(t) consists of impulses, i.e.,

$$ f(t) = \sum_i a_i\,\delta(t - t_i). \qquad (6) $$

We note that from Eq. (6) and Eq. (1), in the discrete case, a system of nonlinear equations for $a_i$ and $t_i$ can be formed. However, the solution is computationally complex and noise sensitive.

A.  Description  of  the  Iteration 

We are given the band $F_\sigma(\omega)$ of F(ω) and we wish to determine f(t) under the assumption of Eq. (6),

$$ f(t) = \sum_i a_i\,\delta(t - t_i). $$

We use the following iteration (Figure 1). We start with $F_0(\omega) = F_\sigma(\omega)$; then, if $G_n(\omega)$ is the estimator for F(ω) at the n-th step of the iteration, we form the spectrum

$$ F_n(\omega) = G_n(\omega)\,[1 - P_\sigma(\omega)] + F_\sigma(\omega). \qquad (7) $$

Thus $F_n(\omega)$ is a better estimator, since it is obtained by replacing the band of $G_n(\omega)$ from $-\sigma$ to $\sigma$ with the known band of F(ω).


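Then we compute the inverse $f_n(t)$ of $F_n(\omega)$,

$$ f_n(t) = \mathcal{F}^{-1}\{F_n(\omega)\}, \qquad (8) $$

and we form the function

$$ g_{n+1}(t) = \begin{cases} f_n(t) & \text{if } |f_n(t)| > C_{n+1} \\ 0 & \text{if } |f_n(t)| \le C_{n+1} \end{cases} \qquad (9) $$

by setting the parts of $f_n(t)$ which are less than $C_{n+1}$ equal to zero. The truncation level $C_n$ is defined as

$$ C_{n+1} = \begin{cases} K_n & \text{if } K_n < C_{\max} \\ C_{\max} & \text{if } K_n \ge C_{\max} \end{cases} \qquad (10) $$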

where $K_n$ is the minimum value of $|g_n(t)|$ excluding zero. The use of the truncation level $C_n$ utilizes the additional information that f(t) is a signal of the form

$$ f(t) = \sum_i a_i\,\delta(t - t_i). $$

The choice of $C_n$ is not critical as long as the remaining non-zero part of $g_n(t)$ contains all the pulse locations $t_i$. The choice of $C_{\max}$ depends on our a priori knowledge of $a_i$, and it must be less than the minimum expected value of $a_i$, that is,

$$ C_{\max} < \min_i (a_i). \qquad (11) $$

As a result of using $C_n$, at each step of the iteration $g_n(t)$ is concentrated closer to the $t_i$, and the length of the truncated region becomes smaller and smaller. At the end of the iteration, when the pulses have been recovered, the remaining non-zero part of $g_n(t)$ in Eq. (9) consists only of the points $t_i$.
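A minimal discrete sketch of the iteration, written with the FFT, may help fix the order of the steps. It follows the structure described above (replace the known band, invert, truncate, raise the truncation level). The initial level `c0`, the boolean band mask, and the function name are assumptions of this sketch, not details given in the report.

```python
import numpy as np

def restore_impulses(f_sigma, band, c0, c_max, n_iter=50):
    """Iterative restoration of an impulse train from its low-passed version.

    f_sigma : real array, the smoothed (band-limited) signal
    band    : boolean mask over FFT bins, True inside the known band
              (must be symmetric in frequency)
    c0      : assumed initial truncation level
    c_max   : upper limit of the truncation level
    """
    F_sigma = np.fft.fft(f_sigma) * band
    g = np.zeros(len(f_sigma))                   # g_0 = 0, so F_0 = F_sigma
    c = c0
    for _ in range(n_iter):
        G = np.fft.fft(g)
        F_n = G * (~band) + F_sigma              # restore the known band
        f_n = np.fft.ifft(F_n).real              # back to the time domain
        g = np.where(np.abs(f_n) > c, f_n, 0.0)  # truncate small values
        nonzero = np.abs(g[g != 0.0])
        if nonzero.size:
            c = min(nonzero.min(), c_max)        # next truncation level
    return f_n, g
```

A band mask such as `np.abs(np.fft.fftfreq(n, d=1.0 / n)) <= m` keeps bins up to index m on both sides of zero frequency. By construction, every iterate agrees with the given data inside the known band.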

B.  Extension 

If the noise is not white but colored, a proper window can be inserted in the algorithm, weighting properly the given part of F(ω). Then instead of Eq. (7) we have

$$ F_n(\omega) = G_n(\omega)\,[1 - W(\omega)] + F_\sigma(\omega)\,W(\omega) \quad \text{with } 0 \le W(\omega) \le 1. \qquad (12) $$

Also, a constant $A_n$ can be inserted in the algorithm to speed up the convergence. Then

$$ F_n(\omega) = A_n G_n(\omega)\,[1 - W(\omega)] + F_\sigma(\omega)\,W(\omega) \qquad (13) $$

where $A_n$ is chosen from the minimization of

$$ \int_{-\infty}^{\infty} |F(\omega) - A_n G_n(\omega)|\,[1 - W(\omega)]\,d\omega = \min . $$

Applications of the presented algorithm include detection of double or multiple stars, clutter elimination, determination of hidden periodicities, and estimation of the location of impulse trains from smoothed data. Extensions of the method to two-dimensional problems include sharp edge image restoration and line enhancement. Examples are given below indicating some of the applications.

Example 1: The first example is the estimation of the location of impulse trains from smoothed data.

The unknown signal consists of six pulses with different amplitudes located as follows ($\Delta t$ is the sampling interval):

    Location $t_i/\Delta t$     36     41     48     52     59     64
    Amplitude $a_i$             0.9    0.8    1.00   1.00   0.8    0.9

and

$$ f(t) = \sum_i a_i\,\delta(t - t_i) + n(t) \qquad (14) $$

where n(t) is white noise uniformly distributed in [-0.05, 0.05]. Figures 2(a) and (b) show f(t) and the amplitude $|F(\omega)|$ of its Fourier transform. The given part of F(ω) is shown in Fig. 3(b) and its inverse signal $f_\sigma(t)$ in Figure 3(a). Notice that the six




added. (Its band is from -6 to 6.) The signal is entirely unrecognizable. Filtering the low band with a high-pass filter, we remove the noise, but again, as it appears in Fig. 7(a), the signal is unrecognizable since its low frequencies are missing (see Figure 7(b)). We apply the algorithm to restore these missing low frequencies. The iteration uses the formula Eq. (12) with

$$ W(\omega) = 1 - P_\sigma(\omega). \qquad (16) $$

The result of the iteration is shown in Figs. 8(a), (b) and (c) for the 0-th, 3-rd and 10-th steps. The signal again has been restored completely. Therefore, by using the above algorithm we removed the filter effects.


Example 3: Hidden periodicities.

In a number of applications, the signal f(t) is a sum of sine waves of unknown amplitudes $a_i$ and frequencies $\omega_i$ (sunspot variations, for example). The transform F(ω) of f(t) consists of lines (impulses) at $\omega_i$; however, since only the segment

$$ f_T(t) = \begin{cases} f(t) & |t| < T \\ 0 & |t| > T \end{cases} $$

of f(t) is available, their locations cannot be determined in terms of the transform $F_T(\omega)$ of $f_T(t)$. Interchanging t and ω in the above problem, we are back to the problem of detecting an impulse train from its smoothed data. In this example we consider the case of determining the frequency and the amplitude of a sinusoid given only a part of its period. The given part is about half a period of a 1 Hz sinusoid, and it is distorted by 10% white noise. In Fig. 9 we have 11 samples of the given part. The algorithm is applied to this sinusoid using a fast Fourier transform with 256 points. Figure 10 shows how the spectrum line of the sinusoid emerges during the algorithm. Its location is exactly on the 1 Hz point of the spectrum. This result is more accurate than the one obtained in Ref. 4 through the maximum entropy method.


Advanced Research Projects Agency

HIGH RESOLUTION IN PULSE-ECHO SYSTEM
A. Papoulis and C. Chamzas

Ultrasonic echo information for clinical diagnosis can be enhanced by suitable processing of the received information. A method, based on an extrapolation scheme proposed in Refs. 1, 2, is developed for improving the range resolution of an ultrasonic pulse-echo system. The ambiguity in the location and the amplitude of the return signal y(t) due to noise and to interference from overlapping echoes is reduced by restoring the frequency components of the impulse response h(t) of the medium that are outside the significant band of the transmitted signal x(t).

The principle of operation is the following. A transducer F generates an acoustic beam x(t - kz) traveling in the z-direction. This beam is reflected by parts of the body along its path LL. A wave y(t + kz) results, propagating in the opposite direction (see Figure 1). The function y(t) depends on the reflection coefficients r(z) along LL and is used to estimate r(z). The instruments in current use are based on simple schemes for estimating r(z) (envelope detection).


Fig.  1. 

Under various simplifying assumptions (no multiple reflections, no attenuation, no beam variation along the reflecting line LL, no presence of noise, etc.) the reflected signal at the exit plane z = 0 should be

$$ s(t) = \int_{-\infty}^{\infty} x(t - \tau)\,h(\tau)\,d\tau \qquad (1) $$

where h(t) = r(kz) equals the reflection coefficient suitably scaled. However, since the simplifying assumptions are not completely satisfied, the received signal y(t) is

$$ y(t) = s(t) + n(t) \qquad (2) $$

where n(t) in the present work is assumed to be white noise uniformly distributed.

Taking Fourier transforms of both sides of Eq. (2) we obtain

$$ Y(\omega) = X(\omega)H(\omega) + N(\omega). \qquad (3) $$

Since x(t) is a modulated acoustic wave with a spectrum X(ω) negligible outside an interval $(\omega_1, \omega_2)$ (see Fig. 2), the problem of estimating H(ω) cannot be solved by simply deconvolving Y(ω) in Equation (3). Hence, due to the presence of noise, H(ω) can be reliably estimated only for ω inside this interval.


Fig. 2. The output x(t) of the transducer and its Fourier transform pair.

Let

$$ H_a(\omega) = \frac{Y(\omega)}{X(\omega)} = H(\omega) + \frac{N(\omega)}{X(\omega)} \qquad (4) $$

be the estimator obtained by deconvolving Equation (3). To reduce the resulting error we must either replace $H_a(\omega)$ by zero outside the reliable interval $(\omega_1, \omega_2)$ or multiply $H_a(\omega)$ by a suitable window W(ω) that weighs properly the estimator $H_a(\omega)$. Thus as a first estimator we have (see Fig. 3)

$$ H_0(\omega) = H_a(\omega)\,W(\omega). \qquad (5) $$

In the present work a method is described where the missing frequencies of $H_0(\omega)$ can be recovered if the reflecting substance consists of discontinuity surfaces. In this case

$$ h(t) = \sum_i a_i\,\delta(t - t_i) \qquad (6) $$

hence

$$ s(t) = \sum_i a_i\,x(t - t_i). \qquad (7) $$

A.  Description  of  the  Method 

The signal y(t) is preprocessed with a nonlinear device $N_L$, which increases significantly the S/N ratio by setting equal to zero the parts of the received signal where no signal exists. This is based on the fact that x(t) is time-limited (see Fig. 2) and therefore, because of Eq. (7), y(t) has parts where it can be easily recognized that no signal exists. In this step a more sophisticated noise reduction can be achieved by interpolating the noise in the area where the signal s(t) exists.

In the next step the resulting $y_T(t)$ is deconvolved with X(ω) and a very unreliable output $h_a(t)$ is obtained. A window W(ω) with $0 \le W(\omega) \le 1$ is used to cancel or to reduce the weight of the bands of the signal which have been highly corrupted by the noise. These bands, as we can see from Eq. (4), are the frequencies outside the $(\omega_1, \omega_2)$ interval. Notice that by using a window of the form

$$ W(\omega) = X(\omega)\,W_1(\omega) \qquad (8) $$

the undesired deconvolution can be avoided (see Equations (4) and (5)).
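The windowed deconvolution that produces the first estimate can be sketched as follows, using a band-pass window that is one where the transducer spectrum is significant and zero elsewhere. The relative level that defines the reliable band and the function name are assumptions of this sketch. The report's own windows may differ.

```python
import numpy as np

def first_estimate(y, x, rel_level=0.1):
    """Band-limited deconvolution of the received signal y by the pulse x.

    The quotient Y/X is kept only where |X| is at least `rel_level`
    times its maximum (a hypothetical choice of the reliable band);
    elsewhere the estimate is set to zero.
    """
    X = np.fft.fft(x)
    Y = np.fft.fft(y)
    reliable = np.abs(X) >= rel_level * np.abs(X).max()
    H0 = np.zeros_like(Y)
    H0[reliable] = Y[reliable] / X[reliable]
    return np.fft.ifft(H0).real, H0, reliable
```

In the noise-free case the quotient reproduces the true spectrum of h(t) inside the reliable band; the frequencies outside that band are the ones the extrapolation is meant to recover.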

The first estimator $h_0(t)$ of h(t) is fed into the extrapolator, where the missing or reduced frequencies of h(t) are recovered. The extrapolator (see Fig. 4) performs an iteration using the following scheme.

It forms the function

$$ H_n(\omega) = H_0(\omega) + [1 - W(\omega)]\,G_{n-1}(\omega), \qquad 0 \le W(\omega) \le 1, $$

where $G_{n-1}(\omega)$ is the transform of $g_{n-1}(t)$, and then sets equal to zero the parts of the signal $h_n(t)$ which are below a certain level. That is,

$$ g_n(t) = \begin{cases} h_n(t) & \text{if } |h_n(t)| > C_n \\ 0 & \text{if } |h_n(t)| \le C_n . \end{cases} \qquad (9) $$

In the beginning of the iteration $C_n$ has a very low value $C_0$, but during the process $C_n$ is increased as follows. If

$$ K_n = \min\left[\{\,|g_{n-1}(t)|\,\} - \{0\}\right] $$

then

$$ C_n = \begin{cases} K_n & \text{if } K_n < C_{\max} \\ C_{\max} & \text{if } K_n \ge C_{\max} . \end{cases} \qquad (10) $$

$C_0$ and $C_{\max}$ are two constants which are calculated before the iteration starts. The noise level and the expected value of the minimum reflection coefficient $a_i$ are the basic criteria for defining $C_0$ and $C_{\max}$, respectively. Therefore, utilizing the additional information that the reflecting substance consists of discontinuity surfaces, an iterative algorithm has been created which converges fast and, moreover, is quite stable in a noisy environment. The above iteration has been applied successfully in different cases. The computations are carried out digitally in terms of discrete Fourier series.

B.  Illustration 

Consider a synthetic laminate comprising six reflecting boundaries, with reflection coefficients related to the impedance discontinuities at each of these boundaries. We can use the values of these reflection coefficients to form a series of reflective echoes, i.e.,

$$ s(t) = \sum_{i=1}^{6} a_i\,x(t - t_i) \qquad (11) $$

where $a_i$ is the reflection coefficient occurring at boundary i at a time $t_i$. The synthetic laminate chosen for this illustration is shown in Figure 5. The resulting h(t) is shown in Figure 6, Figure 7 shows the synthesized echo train s(t), and Fig. 8 shows the received echo y(t), where

$$ y(t) = s(t) + n(t) $$

with n(t) white noise uniformly distributed in [-0.013, 0.013]. The signal $y_T(t)$ is obtained by setting equal to zero the part of y(t) outside the lined area in Figure 8.

Fig. 5. Synthetic laminate. Impedance ratios $Z_i/Z_0$: 1.180, 0.852, 1.222, 1.174, 1.199, 1.222. Reflection coefficients: $a_i = (Z_i/Z_0 - 1)/(Z_i/Z_0 + 1)$.

Fig. 6.
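As a small numerical check, the reflection coefficients can be recomputed from the impedance ratios listed with Fig. 5, assuming the relation a = (r - 1)/(r + 1) for an impedance ratio r. This relation is read from the partly legible figure legend and should be treated as an assumption; the function name is hypothetical.

```python
import numpy as np

def reflection_coefficients(ratios):
    """Reflection coefficient (r - 1)/(r + 1) for each impedance ratio r."""
    r = np.asarray(ratios, dtype=float)
    return (r - 1.0) / (r + 1.0)

# Impedance ratios as printed with Fig. 5.
ratios = [1.180, 0.852, 1.222, 1.174, 1.199, 1.222]
a = reflection_coefficients(ratios)
```

With these inputs the last five values agree with the theoretical coefficients of Table I to about three decimal places; the first printed ratio gives roughly 0.083, slightly below the tabulated 0.085.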

The impulse response $h_a(t)$ resulting from deconvolution is shown in Figure 9.

A window is used as a band-pass filter (lined area in Fig. 2 or Figure 10(a)).

The first estimate $h_0(t)$ is shown in Figure 10(b). The signal $h_0(t)$ is almost noise free (S/N = 52) but is highly distorted because of the filtering by W(ω). Figure 11(a) shows different steps of the iteration and Fig. 11(b) shows the result of the 40-th step. Figure 12(a), (b) shows the same steps of the iteration and $h_{40}(t)$ when, instead of the band-pass filter, another filter $W_1(\omega) = |X(\omega)| / \max|X(\omega)|$ is used, and the error reduction for the two filters is shown in Figure 13(a) and (b). It is clear that by using the filter $W_1(\omega)$ we speed up the convergence in the beginning of the iteration.

The reflection coefficients determined by the two filters are compared with the theoretical values in Table I.

TABLE I.

    Time intervals    a_i            a_i determined by     a_i determined by
    ΔT                theoretical    band-pass filter      W_1(ω) = |X(ω)| / max|X(ω)|

      -29              0.085          0.0818                0.0848
      -14             -0.080         -0.0754               -0.0903
       -3              0.100          0.0893                0.0952
       +3              0.080          0.0671                0.0760
       11              0.090          0.0898                0.0896
       26              0.100          0.100                 0.110


A ROBUSTIZED KALMAN FILTER
I. Kadar and L. Kurz

Recently there has been considerable interest in recursive estimators for the linear model that protect the estimate from outliers and are robust. These include the use of Huber's M-estimators, light-limiter robustized stochastic approximation (SA) estimates based upon Huber's results (Ref. 2), SA estimators both of the Robbins-Monro and Kiefer-Wolfowitz type with adaptive gain coefficients for the scalar case and adaptive gain matrices for the vector case (Refs. 3, 4), with applications to robust SA estimates using nonparametric statistics (Ref. 3), and general (not necessarily robust) estimation theory based upon discrete-time point processes using martingales. The desired robust quality of the estimator is usually assessed not only in terms of the extent to which it removes outliers in the presence of contamination, but also by its performance in an unknown, possibly asymmetric, noise environment: reaching asymptotic normality with practically moderate sample sizes, and remaining unbiased and perhaps optimal in some sense (although not necessarily BLUE) when it is constructed for one kind of noise environment and, in fact, another noise environment holds.

Most of the above cited work on robust estimation for the linear model has been applied either to point estimation (e.g., Refs. 1 and 7) or to detection (Refs. 3, 4, 9), with the exception of Ref. 8, where robustized one-step Bayesian-like estimators were sequentially applied to robustize the Kalman filter (KF) for either (symmetric heavy-tailed) non-Gaussian state or (symmetric heavy-tailed) non-Gaussian measurement noise densities. The work in Ref. 8 considered the application of M-estimates and min-max theory for the ε-contaminated normal family combined with results from influence function robustized SA theory (Ref. 2), using the linear-non-linear (L-N) robustizing approach (Ref. 10). The L-N approach, which first applies a linear transformation to scale and symmetrize the CDF of the innovations and then operates on the result with a non-linear (i.e., light-limiter) robustizing transformation, suffers from a major drawback. When the underlying noise CDF is unknown, it is impossible to find the linear transformation, so the CDF is assumed to be Gaussian, the linear transformation is computed, and the nonlinearity is applied with the hope that it will robustize the estimate when the CDF is non-Gaussian. Clearly, when the CDF is non-Gaussian, the assumed linear transformation is inappropriate, and a number of approximations are needed in order to justify the results of Reference 8.

The material presented here is a generalization of the work of Masreliez and Martin (Ref. 8) in robustizing the KF. We first establish the background for the L-N approach from the MLE and KF point estimation viewpoint and show the asymptotic relationships between KF and SA. This leads directly to the one-step robust L-N estimator in Theorem 1. We then show that it is possible to avoid the ambiguity in selecting the linear transformation by the use of a batch-integer-rank-linear preprocessing transformation on the innovations (the batch-non-linear-linear, or B-N-L, approach) using the two-sample Mann-Whitney-Wilcoxon nonparametric statistics (MWWNS) from the Chernoff-Savage class in batch processing (in the spirit and form of Evans, Kersten and Kurz, and of Kersten and Kurz), generalizing Theorem 3 in Reference 8. We consider the case when the measurement noise probability density function (pdf) is described by the mixture distribution model

$$ \mathcal{F} = \{ f(\cdot) : f(x) = (1 - \varepsilon)\,g(x) + \varepsilon\,h(x),\ h(\cdot) \in \mathcal{H},\ 0 \le \varepsilon < 1 \} $$

where g(x) is a given density function and $\mathcal{H}$ is a wide class of density functions, where $h(\cdot) \in \mathcal{H}$ is usually a high variance pdf representing a burst noise component of the measurement channel. The case of non-Gaussian state noise pdf (Theorem 4 in Ref. 8) is not considered, since our generalized results in Theorem 2 (the B-N-L approach) directly apply. The proof of Theorem 2 is similar to methods in Refs. 3, 4 and 8, and follows from a significant result from the Chernoff-Savage theory on asymptotic normality and symmetry properties for a class of linear two-sample statistics and its extensions, along with results for the particular case of the MWWNS, which reaches asymptotic normality practically with as few as eight batch samples (Refs. 14, 15), used in robust SA estimation of location. This means that after a preprocessing delay of m steps, the MWWNS robustized filter based upon the nominal Gaussian CDF's becomes asymptotically (m ≥ 8) Gaussian and is essentially unaffected by the degree and nature of the contamination, i.e., by the limiting effect of the integer-rank transformation.

So, in effect, we immunize the measurements against burst noise, and our filter is robust for non-Gaussian noise.
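The limiting effect of a rank transformation can be seen with the basic two-sample Mann-Whitney statistic, which counts how often a value from one sample exceeds a value from the other. This is only the elementary statistic, not the batch preprocessor of the report; the function name is hypothetical.

```python
import numpy as np

def mann_whitney_u(x, y):
    """Number of pairs (x_i, y_j) with x_i > y_j, counting ties as 1/2."""
    x = np.asarray(x, dtype=float)[:, None]
    y = np.asarray(y, dtype=float)[None, :]
    return float(np.sum(x > y) + 0.5 * np.sum(x == y))

u_clean = mann_whitney_u([1.0, 3.0, 5.0], [2.0, 4.0])
u_burst = mann_whitney_u([1.0, 3.0, 5000.0], [2.0, 4.0])
```

Replacing the value 5 by an outlier of 5000 leaves the statistic unchanged, because only the ordering of the samples enters, not their magnitudes.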

In order to establish a relative comparison measure between the L-N and B-N-L approaches, we convert the robust KF problem to a robust point estimation problem and apply asymptotic results from robust Robbins-Monro (RMSA) procedures based upon the work of Evans, Kersten and Kurz (Refs. 2, 3) to compute variance bounds of the resultant estimates. This allows the computation of a criterion based upon the relative sample number ratio (RSNR) of the B-N-L and L-N approaches.

It should be noted that the vector KF is addressed directly in this report, and we refer to the description of the general linear discrete-time model in Ref. 16 for the familiar minimum variance KF. Our direct solution of the vector case is also based upon the adaptation of some recent results on robust vector RMSA by Kersten and Kurz.

Finally, Monte Carlo simulation results and numerical examples substantiate the theory by verifying the improved performance of the B-N-L filter robustized via the MWWNS over the L-N light-limiter robustized filter.

A. Robust Filtering Via the L-N Approach: Some Known Results

We intend to robustize the well-known linear model

$$ y_k = H_k x_k + v_k, \qquad (1) $$

where $y_k$ is an (r x 1) measurement vector, $H_k$ is an (r x n) measurement matrix, $x_k$ an (n x 1) state vector and $v_k$ an (r x 1) measurement noise. Furthermore, assume that the state $x_k$ evolves from

$$ x_k = \Phi_{k-1} x_{k-1} + w_{k-1}, \qquad (2) $$

where $\Phi_{k-1}$ is the (n x n) state transition matrix and $w_k$ is the plant noise. Taking $\{v_k\}$ and $\{w_k\}$ to be zero-mean, mutually independent Gaussian white-noise sequences which are also independent of the initial state $x_0$ (which is also Gaussian) leads to the minimum variance Kalman filter (KF) described in detail in Reference 16. As discussed in Ref. 16, the estimate $\hat{x}_k$ at stage k is formed by the prediction $\bar{x}_k = \Phi_{k-1}\hat{x}_{k-1}$ corrected by a term proportional to the innovation, $y_k - H_k\bar{x}_k$.
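For reference, one step of the standard (non-robust) Kalman filter for this model can be sketched as below. The noise covariances `Q` (plant) and `R` (measurement) and the function name are notation introduced for this sketch; the report defers the filter equations to its Ref. 16.

```python
import numpy as np

def kalman_step(x_hat, P, y, Phi, H, Q, R):
    """One predict/correct step of the minimum variance Kalman filter.

    x_hat, P : previous estimate and its error covariance
    y        : current measurement
    Phi, H   : state transition and measurement matrices
    Q, R     : assumed plant and measurement noise covariances
    """
    x_pred = Phi @ x_hat                       # prediction
    P_pred = Phi @ P @ Phi.T + Q
    S = H @ P_pred @ H.T + R                   # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)        # Kalman gain
    innovation = y - H @ x_pred
    x_new = x_pred + K @ innovation            # correction
    P_new = (np.eye(len(x_hat)) - K @ H) @ P_pred
    return x_new, P_new, innovation
```

The correction is linear in the innovation, which is why a single large outlier in the measurement noise passes directly into the estimate.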

Now consider the case where the measurement noise is heavy-tailed and the plant noise is Gaussian. A large innovation vector in this case is due to the outliers in the measurement noise density. So, intuitively, we would want to attenuate the large innovations by some operation and place more emphasis on our predictions. This intuitive reasoning leads one directly to the robust one-step Bayesian-like estimator and its multi-step application, which comprise the basis of the "L-N approach." In the following, we present a brief review leading to the development of the one-step Bayesian L-N estimator for the case of the ε-contaminated normal family, to allow a direct means of comparison with the B-N-L approach realized in Theorem 2 in the next section. Our approach in this area is similar to that of Ho (Ref. 17), leading directly to Theorem 1.

1. Point Estimation (MLE and KF) and Stochastic Approximation

Consider the maximum likelihood estimate of x, given in Eqs. (1) and (2) with $H_k = H$, $\Phi_{k-1} = I$, $w_k = 0$ for all k. It is well known that

Applying the Sherman-Morrison Matrix Inversion Lemma to Eq. (4), we obtain the KF equations as in Ref. 16, and applying the resultant Kalman gain term, i.e., $P_{k-1}H^T(HP_{k-1}H^T + I)^{-1}$, to itself, we obtain, following Ho (Ref. 17), in the limit

$$ \lim_{k \to \infty} P_{k-1}H^T(HP_{k-1}H^T + I)^{-1} = (1/k)\,P_0 H^T (H P_0 H^T)^{-1} \qquad (5) $$

and we obtain asymptotically

$$ \hat{x}_k = \hat{x}_{k-1} + (1/k)\,P_0 H^T (H P_0 H^T)^{-1}\,(y_k - H\hat{x}_{k-1}) \qquad (6) $$

which is in the form of a multidimensional SA (Ref. 4). Now, with SA and KF asymptotically related (for the point estimation case), it is quite natural to use asymptotic results from SA theory as bounds for the approximate KF structures.
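The stochastic-approximation form of the point estimator can be illustrated with a short sketch: a recursion with gain proportional to 1/k applied to the innovation. The gain matrix is written here as P0 H^T (H P0 H^T)^(-1), which is how the garbled expression in the text appears to read; this, together with the function name, is an assumption of the sketch.

```python
import numpy as np

def sa_point_estimate(ys, H, P0, x0):
    """Recursive estimate with a 1/k gain acting on the innovation."""
    gain = P0 @ H.T @ np.linalg.inv(H @ P0 @ H.T)
    x = np.asarray(x0, dtype=float)
    for k, y in enumerate(ys, start=1):
        x = x + (1.0 / k) * gain @ (y - H @ x)
    return x
```

When H is the identity the gain matrix is also the identity, and the recursion reduces to the running sample mean of the measurements, independent of the starting value.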

The above analysis is directly applicable to robust SA using a general class of nonlinear regression functions (see recent results of Kersten and Kurz, Ref. 4). Furthermore, the previous asymptotic scalar result of Martin (Ref. 2) for the soft-limiter robustized SA (Huber's M-estimator) case can now be extended rigorously to the vector case (see Kersten and Kurz, Ref. 4).

2. The One-Step Robust Bayesian Form

Now let, in Eq. (3), k = 1, $x_0 = \bar{x}$ and $P_0 = M$, and introduce an odd symmetric nonlinearity, with

$$ \psi(t) = \begin{cases} k, & t > k \\ t, & |t| \le k \\ -k, & t < -k \end{cases} $$

the light-limiter influence function operating on the innovation-like term, i.e., $\psi(y - H\bar{x})$, with the following two properties of the distribution of the innovation, $v = y - H\bar{x}$, assumed to hold:

(P1) $F_v(v_1, v_2, \ldots, v_i, \ldots, v_r) = 1 - F_v(v_1, v_2, \ldots, -(v_i + 0), \ldots, -v_r)$

(P2) All marginal distributions for $F_v$ have absolutely continuous densities and are members of $\mathcal{F}_\varepsilon$, with $F \in \mathcal{F}_\varepsilon$, where

$$ \mathcal{F}_\varepsilon = \{ F(\cdot) \mid F = (1 - \varepsilon)N + \varepsilon H,\ 0 \le \varepsilon < 1,\ H \text{ symmetric} \} $$

and N is the unit normal CDF, and we obtain the Bayesian one-step robust estimator:

Theorem 1: Consider the measurement relation, Eq. (1), and assume that the CDF, $F_v$, of $v = y - H\bar{x}$ satisfies (P1) and (P2). Let the prior density for x be Gaussian, $N(\cdot\,; \bar{x}, M)$. Then the estimator

$$ \hat{x} = \bar{x} + M H^T \psi(v) \qquad (7) $$

will yield a bounded variance

$$ P = E[(x - \hat{x})(x - \hat{x})^T] \le M - M H^T H M\, E_{F_0}[\psi_r'] \qquad (8) $$

where $\psi(\cdot)$ is an (r x 1) vector operator with components $\psi_i(t) = \psi_r(t_i)$, where $\psi_r(\cdot)$ is the influence function corresponding to the univariate family $\mathcal{F}_\varepsilon$. $E_{F_0}[\cdot]$ denotes expectation for the least favorable pdf in $\mathcal{F}_\varepsilon$, and $\psi_r' \equiv d\psi_r/dt$.
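The one-step robust estimator can be sketched directly from its form: the light-limiter is applied component-wise to the innovation, and the result is mapped back to the state through the prior covariance. The default limiter level and the function names are assumptions of this sketch. No claim is made that the innovation has been scaled as properties (P1) and (P2) require.

```python
import numpy as np

def light_limiter(t, k):
    """Odd-symmetric limiter: t for |t| <= k, otherwise +/- k."""
    return np.clip(t, -k, k)

def one_step_robust(x_bar, M, H, y, k=1.5):
    """Prior mean plus M H^T applied to the limited innovation."""
    v = y - H @ x_bar                 # innovation
    return x_bar + M @ H.T @ light_limiter(v, k)
```

A small innovation component passes through unchanged, while a very large one contributes no more than the limiter level k.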

Proof: The proof is given in Ref. 8 and in part of the proof of the robust filter realized in Theorem 2 in the next section. It follows by direct substitution and by the application of a bound on the asymptotic variance of the light-limiter robustized SA estimator, coupled with Huber's min-max result for the ε-contaminated normal family, with one added crucial assumption not mentioned in Reference 8. For the vector case under consideration, the robust SA vector estimator used in establishing the asymptotic bound has to be decoupled with a diagonal gain matrix and the light-limiter nonlinearity applied component-wise to the regression function. So, in essence, we have r SA procedures operating in parallel. Then Huber's results, as adapted by Martin (Ref. 2), follow component-wise, i.e., with a diagonal covariance matrix whose elements are


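$$\frac{A_i^2\, E_{F_0}(\psi_i^2)}{2 A_i\, E_{F_0}(\psi_i') - 1}$$

where $A_i = E_{F_0}^{-1}(\psi_i')$ and $\psi_i = -f_{0i}'/f_{0i}$, with $f_{0i}$ the least favorable pdf in $\mathcal{F}_\varepsilon$, which leads to a SA estimator, $T_0$, such that $V_{SA}(F, T_0) \le V_{SA}(F_0, T_0)$ for all $F \in \mathcal{F}_\varepsilon$, where

$$V_{SA}(F_0, T_0) = E_{F_0}^{-1}(\psi_i')$$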

The statement of Theorem 1 and its proof assumed that the CDF of the innovation vector satisfied the symmetry and scale conditions described in properties (P1) and (P2). Note, however, that this is seldom the case, and one first has to find a linear transformation, T, on the innovation vector, i.e., v = T(y - Hx), and then apply the nonlinearity (the L-N approach) once properties (P1) and (P2) are satisfied. This requires a priori knowledge of the CDF of the innovation vector, which is somewhat contrary to the notion of robustness. The situation becomes even more complicated (in Theorem 3 of Ref. 8) where the one-step Bayesian estimator is sequentially applied to form a robust approximate filter which provides protection against measurement noise outliers. Now T becomes $T_k$, and it has to be recomputed at each step. This is the basic disadvantage of the "L-N approach," as pointed out before. However, this disadvantage can be eliminated by replacing the L-N operation by an m-step B-N-L preprocessor using the symmetric version of the MWWNS in batch processing. We avoid the ambiguity of selecting $T_k$, and we asymptotically (after m preprocessing steps) approximate a robust Gaussian filter, resulting in a true robust structure whose operation is essentially independent of the underlying CDF's. This structure is realized in Theorem 2 of the next section.

B. The Batch Nonlinear-Linear (B-N-L) Preprocessing Approach

Consider the sequential application of the basic one-step Bayesian estimator delineated in Theorem 1, with the L-N operation changed to the B-N-L preprocessor operation. This consists of forming a batch of samples of size m (of the components of the innovation vector) and selecting a nonparametric statistic of the form

$$W^{m_1,m_2}(\cdot) = \frac{1}{m_1 m_2} \sum_{i=1}^{m_1} \sum_{j=1}^{m_2} \operatorname{sgn}(X_i - Y_j)$$

which is the symmetric version of the two-sample Mann-Whitney-Wilcoxon nonparametric statistic (MWWNS) from the Chernoff-Savage class. However, before we state the B-N-L approach realized in Theorem 2, it is necessary to examine briefly some important characteristics of the MWWNS. One important and useful characteristic of the MWWNS is that its use is not restricted by any symmetry properties on the CDF's of the i.i.d. samples $X_1, \ldots, X_{m_1}$ and $Y_1, \ldots, Y_{m_2}$, with continuous CDF's F(x) and G(x), respectively. It has a number of additional useful properties (see Appendix A of Ref. 18). Under H (F(x) = G(x)),

$$E[W^{m_1,m_2}] = 0$$

$$\operatorname{Var}[W^{m_1,m_2}] = \frac{m_1 + m_2 + 1}{3 m_1 m_2}$$

Also, under H and K (F(x) ≠ G(x)),

$$\sup_{F,G} W^{m_1,m_2} = -\inf_{F,G} W^{m_1,m_2} = 1$$
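The moments and bounds quoted above can be checked for small batches. The following is a minimal sketch, not the authors' implementation: the function names `mww_symmetric` and `null_moments` are hypothetical, and the check assumes continuous CDF's (no ties), so that under H every assignment of the pooled ranks to the X sample is equally likely.

```python
from fractions import Fraction
from itertools import combinations

def mww_symmetric(xs, ys):
    """Symmetric two-sample MWW statistic: mean of sgn(x - y) over all pairs."""
    total = 0
    for x in xs:
        for y in ys:
            total += (x > y) - (x < y)   # sgn(x - y)
    return Fraction(total, len(xs) * len(ys))

def null_moments(m1, m2):
    """Exact mean and variance of W under H, by enumerating rank assignments."""
    ranks = range(m1 + m2)
    values = []
    for x_ranks in combinations(ranks, m1):
        chosen = set(x_ranks)
        y_ranks = [r for r in ranks if r not in chosen]
        values.append(mww_symmetric(x_ranks, y_ranks))
    mean = sum(values, Fraction(0)) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

mean, var = null_moments(3, 4)
print("mean:", mean, " variance:", var)
print("formula:", Fraction(3 + 4 + 1, 3 * 3 * 4))
print("extremes:", mww_symmetric([5, 6], [1, 2]), mww_symmetric([1, 2], [5, 6]))
```

Exact rational arithmetic is used so that the enumerated variance can be compared with $(m_1 + m_2 + 1)/(3 m_1 m_2)$ without rounding error.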

With $W^{m_1,m_2}(\cdot)$ a member of the Chernoff-Savage class of two-sample linear rank statistics, it can be shown to be equivalent to a general two-sample statistic of the form

$$T_N(Z) = \sum_{i=1}^{N} a_{Ni} Z_{Ni}$$

(see Appendix A of Ref. 18) with $N = m_1 + m_2$, $Z = (Z_1, Z_2, \ldots, Z_N)$ a set of indicator variables, where $Z_i = 1$ if the i-th random variable from the combined ordered sample of the X's and the Y's is an X and $Z_i = 0$ otherwise, for i = 1, 2, ..., N, and with scores $a_{Ni}$. The asymptotic theory of Chernoff and Savage and its extensions apply to $W^{m_1,m_2}$ under both F(x) = G(x) and F(x) ≠ G(x):

$$\lim_{N \to \infty} P\left[\frac{W^{m_1,m_2} - E\,W^{m_1,m_2}}{\sqrt{\operatorname{Var}\,W^{m_1,m_2}}} \le t\right] = \Phi(t)$$



where Φ(t) is the unit normal CDF. The above result, along with the symmetry properties of the CDF of $T_N(Z)$, makes this statistic ideally suited for the B-N-L approach. Furthermore, from a practical point of view, it is shown by Mann and Whitney and by Fix and Hodges that asymptotic normality is achieved to an accuracy within three places with $m_1 = m_2 = m = 8$. This means that after a preprocessing delay of m steps (with m ≥ 8) we have reached robustness, as mentioned before. In addition, the MWWNS is the most effective, as measured by its Asymptotic Relative Efficiency (ARE) under H, when the distribution has heavy tails, i.e., impulse noise. With $T_1$ and $T_2$ two different nonparametric test statistics with $n_1$ and $n_2$ as dimensions,

$$\mathrm{ARE}_{T_1/T_2} = \lim_{n_1, n_2 \to \infty} (n_2/n_1)$$

with the significance level and power of the test fixed (compare ARE to RSNR in Section C). Intuitively, this property can be attributed to the sgn(·) function. So, in essence, the robust B-N-L filter is also a robust non-Gaussian filter, due to the m-sample asymptotic properties of the MWWNS. It should also be noted that other two-sample rank statistics from the Chernoff-Savage family could be used in place of the MWWNS; however, the MWWNS is simple to implement and has good (moderate sample) asymptotic properties which are well established.

1. Implementation of the B-N-L

From a practical implementation point of view, the B-N-L preprocessing operation requires k = m initial iterations of the robust filter to obtain the m batch samples. This is accomplished by initially storing the null vector of length m in the batch variables and replacing the elements of the null vector by shifting the data into the batch variables at each iteration step. Therefore, while the batch is "filling up," the method requires that we initially assume that the prediction density and the nominal measurement noise density are Gaussian, which becomes justified after the preprocessing delay of m steps, due to the m-sample asymptotic normality stated before. This assumption is identical to the one used in the L-N approach, although the assumptions needed to implement the L-N scheme are never completely justified, since the limiter operation merely removes outliers and does not possess the required asymptotic properties.
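The batch "filling up" described above can be illustrated with a fixed-length shift register. This is only a sketch under stated assumptions: the report does not give code, the names `make_batch` and `shift_in` are hypothetical, and a scalar innovation component is assumed.

```python
from collections import deque

def make_batch(m):
    """Return a length-m batch initialised to the null vector."""
    return deque([0.0] * m, maxlen=m)

def shift_in(batch, sample):
    """Shift one new sample in; the oldest entry drops out automatically."""
    batch.append(sample)
    return batch

batch = make_batch(4)
for k, sample in enumerate([0.3, -1.2, 0.7, 2.5, -0.4], start=1):
    shift_in(batch, sample)
    state = "full" if k >= len(batch) else "filling up"
    print(k, list(batch), state)
```

After k = m iterations every entry of the null vector has been replaced by data, which corresponds to the preprocessing delay of m steps.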

Now we are in a position to present the following theorem for the case when we desire protection against measurement noise outliers.

Theorem 2: Consider the linear model, Eqs. (1) and (2), and make the following assumptions:

(A1) $(x_k - \bar{x}_k) \equiv \tilde{x}_k$ is Gaussian with mean zero and covariance matrix $M_k$ for all k,

(A2)  A sequence  of  diagonal  scaling  matrices  is  known  such  that 

ri‘^^1  ■ ^^ii 


Kk  - K ) 
2 2 2 


K ‘^k  - ^k  ) 

_ r r r _ 


where $v_k$ satisfies properties (P1) and (P2) stated before, and W(·) is an (r×1) vector m-batch-integer-rank operator with scalar components using a symmetric version of the Mann-Whitney-Wilcoxon statistic

$$W(\cdot) = \frac{1}{m^2} \sum_{i=1}^{m} \sum_{j=1}^{m} \operatorname{sgn}(X_i - Y_j)$$

where

$$\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_r^2 = \sigma^2 = \frac{2m + 1}{3m^2}$$

W of Eq. (10) and A of Eq. (11) are independent of k. In more complicated problems they may be functions of k.

Then the filter


Xj^  = + M,_H,  S 


'k*‘k  *k 


(12) 


’"k  ^ ‘h^.k-l’^k-l 

yields  an  unbiased  estimate  of  Xj^  of  the  state  Xj^  with  Mj^  defined  by 

P^“k-“k\^F  t*k'-'l  '^"k^k 

o 

where  E„  [W  (• )]  is  a diagonal  matrix  with  elements 
k 


(13) 

(14) 


“k.  ~ ^v.  - I.  ^ 
1 k.  k. 

1 I 


where  £(•)  is  the  one-dimensional  nominal  Gaussian  density  of  the  components  of 
v^  - will  yield  a bounded  error  covariance  matrix 

Pk  = E[{^  - ^ ^k 


(15) 


for  all  k. 

Proof;  See  Appendix  A of  Reference  18. 

2. Generalized Gaussian Noise Approximation

It follows from the statement of Theorem 2 that the worst-case density no longer exists when we completely robustize the innovation vector, so the scalar terms $a_i = f_{v_i}(0)$ can be precomputed for i = 1, 2, ..., r. As a matter of fact, a simple approximation can be derived in precomputing the $a_i$'s, such that $a_i = a = f(0)$, where f(0) is the nominal Gaussian pdf of the innovation evaluated at zero.

Assume that the pdf of each component of the innovation vector is given by

$$g_c(v) = \frac{c}{2A(c)\,\Gamma(1/c)} \exp\left\{-\left[\frac{|v|}{A(c)}\right]^c\right\}$$

where

$$A(c) = \left[\sigma^2\, \Gamma(1/c)/\Gamma(3/c)\right]^{1/2}$$

and Γ(·) is the gamma function. The variance of $g_c(v)$ is σ², and $g_c(\cdot)$ is a generalization of the Gaussian density function with zero mean and variance σ², the resulting class of densities being symmetric and unimodal with different rates of exponential decay. (For c = 2, $g_c(\cdot)$ is the Gaussian density; for c = 1, $g_c(\cdot)$ is the Laplace density; and for other values of c, $g_c(\cdot)$ approximates a large class of both thin- and heavy-tailed densities.) Now, in examining the variation of $g_c(0)$ with c, we find that for 1 ≤ c < 4, $g_c(0)$ varies only over a range of about 2 to 1. So one would expect the MWWNS robustized filter to be insensitive to the exact value of f(0). As a matter of fact, simulation results (in mixture noise with a 10% burst noise component of variance 64) indicate that with a 200% change in f(0) the state variance of the robust filter changed by only about 1% (from 0.12770 to 0.12634). This result further confirms the robust property and simplifies the construction of the B-N-L filter for the contaminated Gaussian measurement noise family $\mathcal{F}_\varepsilon$.
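The claim that $g_c(0)$ changes by only about a factor of two over 1 ≤ c ≤ 4 can be checked numerically. The sketch below assumes the standard generalized Gaussian form with unit variance (σ = 1); the function names `scale_A` and `g_c` are illustrative only.

```python
import math

def scale_A(c, sigma=1.0):
    """A(c) = [sigma^2 * Gamma(1/c) / Gamma(3/c)]^(1/2)."""
    return math.sqrt(sigma ** 2 * math.gamma(1.0 / c) / math.gamma(3.0 / c))

def g_c(v, c, sigma=1.0):
    """Generalized Gaussian density with decay exponent c and variance sigma^2."""
    a = scale_A(c, sigma)
    return c / (2.0 * a * math.gamma(1.0 / c)) * math.exp(-((abs(v) / a) ** c))

for c in (1.0, 1.5, 2.0, 3.0, 4.0):
    print(f"c = {c:3.1f}   g_c(0) = {g_c(0.0, c):.4f}")
```

For c = 2 the value at the origin reduces to the Gaussian $1/\sqrt{2\pi}$, and for c = 1 to the unit-variance Laplace value $1/\sqrt{2}$, which gives two independent checks on the reconstruction of the density.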

3. Properties of the B-N-L in Non-Gaussian and Asymmetric Contaminated Noise

An important result which follows directly from examining the properties of the MWWNS is that the contaminated measurement noise need not be Gaussian (as stated in Section A) as long as both the given density function and the contamination are symmetrically distributed. For the case of asymmetric contamination, the MWWNS becomes only median unbiased, as shown by Hodges and Lehmann (Ref. 19), and $\hat{x}_k$ will no longer be an unbiased estimate of the state $x_k$. In this case one could increase the dimensionality of the state vector by augmenting the state by a constant, i.e., $b_k = b_{k-1}$, and the measurement equations by $y_k = H_k x_k + b_k + v_k$, and estimate the bias, b, along with the other parameters. Therefore, we use the estimated state to compensate for the unknown bias.


C. A Relative Comparison Measure

In comparing the L-N and the B-N-L robust structures, it would be desirable to establish a relative comparison measure based upon the relative sample number ratios (RSNR) of the respective approaches and possibly to compare each approach with the nominal minimum variance KF. This requires the computation of bounds on the asymptotic variances. In the case of the KF structure, this is only possible for the case of constant coefficients (the point estimation case), when the solution of the Riccati equation governing the state covariance becomes a quadratic and the exact asymptotic bound can be precomputed. However, the solution is still very difficult, except for simple first-order systems. Since we are only interested in a relative comparison measure, we take an alternate approach.

As illustrated in Section A, we evolved the robust approximate KF structures via the robust RMSA approach. So it is quite natural to refer to the asymptotically equivalent robust RMSA when we examine asymptotic behavior. Evans (Ref. 5) has shown, for the case of a nonparametric-statistic robustized scalar RMSA with adaptive gain coefficients, that the asymptotic variance of the resultant estimator satisfies


Varp(i|;) 

nE|.(ijj') 


A^E^(+') 

ZAE^ir)-  1 


Var  (i|i) 

nE^(4.-) 


where ψ stands for the nonlinearity (nonparametric statistic) satisfying the constraints of Theorem 1 of Reference 3. For the vector RMSA with adaptive gain coefficient, the gain coefficient becomes a diagonal gain matrix and the nonlinearity is applied component-wise, as shown by Kersten and Kurz (Ref. 4).


Now, by equating the lower bounds with the explicit assumption that the covariance matrix of the resultant asymptotically normal pdf is diagonal,

$$n_\ell\, V_{SA_{ii}}(T_\ell, F) = n_w\, V_{SA_{ii}}(T_w, F)$$

component-wise, where $T_\ell$ is the limiter influence function robustized RMSA given by Martin (Ref. 2) and $T_w$ stands for the MWWNS robustized RMSA given in Ref. 3, we generalize the results of Refs. 2 and 3 to the vector case and can solve for the RSNR with elements $\{V_{SA_{ii}}\}$. In the general case, an alternate measure can be defined in terms of the sum of the diagonal elements, i.e.,

f w 


which can be used for the general covariance matrix relations given in Ref. 4 by neglecting the contribution of the off-diagonal elements in determining RSNR.

Now, under the simplifying assumption of a diagonal covariance matrix, for the vector case, we obtain for the robust KF structures considered (for symmetric pdf's)


RSNR(T^/T^)  = 


k^ 

2 + ^ 

3 2 

■ k -2 

J f(z)dz 

3m 

-k 

(16) 


where f(·) is the pdf of the components of the innovation vector, ψ is the light-limiter influence function with scale parameter k, and m is the batch size.

In the scalar case with Gaussian f(·) with unity variance, k ≪ 1, and $H_k = I$ for all k, Eq. (16) can be approximated as

RSNR(T^/T^)=  12/(4Tr)^/^ 

This result allows the comparison of the filters in terms of the computation time, which is directly related to RSNR. Comparing the performance of the MWWNS for two different batch sizes, $m_1 = 10$ and $m_2 = 20$,

m m 

RSNR(T  VT  \ for  m>  10 

w w 

which shows that we reach the asymptotic values rather rapidly, independent of m for m > 10.

D. Simulation Results

A series of Monte Carlo simulations and numerical computations were performed to compare the performance of the L-N and B-N-L robustized filters with the KF in both Gaussian noise and in mixture noise. Both the robust (L-N and B-N-L) filters and the KF were designed assuming Gaussian CDF's with constant variances for the measurements, the plant noise and the initial state estimate, $v_k \sim N(0, 1)$, $w_k \sim N(0, \sigma_q^2)$ and $x_0 \sim N(0, M_0)$, respectively. This allowed the comparison of performance of the filters when the simulated "unknown" measurement noise was different from Gaussian.

We  used  the  first-order  system  model  given  in  Ref.  8 

^k  = ’he  + '^k 

with the density of $v_k$ being the Gaussian mixture

$$f(v) = CN(\cdot/\varepsilon, \sigma) \equiv (1 - \varepsilon)\, N(0, 1) + \varepsilon\, N(0, \sigma^2), \quad 0 < \varepsilon < 1,\ \sigma > 1$$

to facilitate direct verification of the results for the L-N case and to serve as a basis for comparison with the B-N-L approach. An advantage of using the above scalar model is that in this case (with constant coefficients) the asymptotic variances can be precomputed (because the solution of the Riccati equation becomes a simple quadratic). This allows direct comparison of the Monte Carlo results with KF theory.

The L-N filter was implemented (using Theorem 3 of Ref. 8), where we used a light-limiter gain coefficient of k = 0.317, which locates k at one standard deviation of the Gaussian innovation CDF. In order to normalize the comparison between the L-N and the B-N-L approaches, we limited the excursion of the MWWNS to k = ±0.317. For the B-N-L approach, we used batch sizes of m = 10 and m = 20 to verify the asymptotic theory of the MWWNS. We investigated the performance of the robust filters and the KF for different ratios of the plant noise and measurement noise variances

$$R = \frac{\sigma_q^2}{(1 - \varepsilon) + \varepsilon \sigma^2}$$

with ε = 0.1 and σ = 8, representing a mixture noise with a 10% burst noise component with variance 64. The range of input parameter values used is listed in Table I. A Monte Carlo step size of 1000 iterations was used in all cases. We verified that the filters reached steady state by varying the value of the initial state covariance, $M_0$, between 0.1 and 1 and found no discernible change in the state variances within six significant figures.
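The mixture parameters quoted above fix the measurement noise variance at (1 - ε) + εσ² = 7.3, which is how the plant noise variances 0.073 and 7.3 of Table I correspond to R = 0.01 and R = 1. A minimal sketch of that arithmetic, together with a sampler for the contaminated normal, is given below; the function names are hypothetical and the sampler is an illustration, not the simulation code used in the report.

```python
import random

def mixture_variance(eps, sigma):
    """Variance of (1 - eps) N(0, 1) + eps N(0, sigma^2)."""
    return (1.0 - eps) + eps * sigma ** 2

def noise_ratio(sigma_q_sq, eps, sigma):
    """Ratio R of plant noise variance to mixture measurement noise variance."""
    return sigma_q_sq / mixture_variance(eps, sigma)

def sample_mixture(eps, sigma, rng):
    """Draw one sample: a burst of standard deviation sigma with probability eps."""
    scale = sigma if rng.random() < eps else 1.0
    return rng.gauss(0.0, scale)

rng = random.Random(0)
draws = [sample_mixture(0.1, 8.0, rng) for _ in range(200_000)]
empirical = sum(d * d for d in draws) / len(draws)
print("mixture variance:", mixture_variance(0.1, 8.0), " empirical:", round(empirical, 2))
print("R values:", noise_ratio(0.073, 0.1, 8.0), noise_ratio(7.3, 0.1, 8.0))
```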


TABLE I. Simulation input parameters.

Case    R       F(v)    $\sigma_q^2$    $\sigma^2$
1       0.01    CN      0.073           64
2       1       CN      7.3             64
3       0.01    N       0.073           0
4       1       N       7.3             0


Some of the simulation results are summarized in Table II and in Figures 1 through 5. We note that the plots of state estimates for cases 2 and 4 (refer to Table I) are not shown, since the high-variance plant and measurement noise created rather noisy-looking estimates. As a matter of fact, cases 2 and 4 (with R = 1) represent the "high Q" KF, where we tend to ignore the dynamic state model and place emphasis on the measurements. For the "low Q" cases (R = 0.01), Fig. 3 is a composite plot of the state estimates of the L-N, B-N-L (with M = 20) and the KF in mixture noise, illustrating the robustness of the B-N-L approach via the MWWNS. Figures 4 and 5 are the individual plots of the state estimates in mixture noise for the KF, L-N, B-N-L with M = 10, and B-N-L with M = 20 filters, respectively.

TABLE II. Summary of Monte Carlo simulation results.

        L-N Simulation     B-N-L, M = 10      B-N-L, M = 20      KF Simulation      KF Computed
Case    Mix.     Norm.     Mix.     Norm.     Mix.     Norm.     Mix.     Norm.     Mix.     Norm.
1 & 3   0.0291   0.0271    0.0237   0.0222    0.0227   0.0216    0.1148   0.0941    0.0956   0.0864
2 & 4   0.2387   0.2241    0.1277   0.1199    0.1100   0.1086    1.0372   0.8633    0.5311   0.8826

Fig. 1. Residual variance comparisons for the L-N and B-N-L robustized and standard Kalman filters.

Fig. 2. Computation time comparison for the L-N and B-N-L robustized and standard Kalman filters (horizontal axis: batch size).

It follows by inspection of Table II that the B-N-L filter outperforms the L-N filter in all cases, which is illustrated in Fig. 1 by comparing the Monte Carlo state variances. We also see from Fig. 1 that the B-N-L filter state variance almost reaches its asymptotic value with a batch size between 6 and 10, and the percent difference (for case 1, "low Q" in mixture noise) between the state variances for M = 10 and M = 20 is 4%. For case 3 ("low Q" in Gaussian noise) the comparable difference is 2.8%. These results compare within 10 to 1 to the theoretical value of 0.37%. However, the above results are in good agreement with theory when we compare them with the variation between the computed and Monte Carlo state variances for the KF of 2.2% for case 3 and 95% for case 2, respectively. The robustness property is also apparent from Table II for the B-N-L filter. The difference between cases 1 and 3 for the L-N filter is 9%, while the corresponding difference for the B-N-L filter is only 3%; the latter approach is therefore more robust.

Fig. 3. A composite comparison of state estimates for the L-N and B-N-L robustized and standard Kalman filters in mixture noise (horizontal axis: iteration step).

A comparison of the relative computation time for the robust and standard Kalman filters on the IBM 360/67 time-sharing computer is shown in Figure 2. We see that while batch processing requires approximately 1.2 to 2 times longer than the KF or the L-N filter, its robustness properties and ease of implementation outweigh its added cost in running time (compare Figures 1 and 2).

In summary, the Monte Carlo simulation of the robust L-N, B-N-L and Kalman filters substantiated the theoretical results and demonstrated the advantages of the batch-nonlinear-linear filter structure using the MWWNS.

Fig. 4. State estimate for the standard Kalman filter in mixture noise.

Fig. 5. State estimate for the L-N robustized filter in mixture noise.

FIGURE 4.5.6 State estimate for the B-N-L robustized filter in mixture noise, M = 10.

FIGURE 4.5.7 State estimate for the B-N-L robustized filter in mixture noise, M = 20.

ROBUSTIZED SCALAR FORM OF GLADYSHEV'S THEOREM WITH APPLICATIONS TO NONLINEAR SYSTEMS

I.  Kadar  and  L.  Kurz 

Gladyshev (Ref. 1) in 1965 gave a simplified proof of the scalar Robbins-Monro Stochastic Approximation (RMSA) procedure, permitting the restrictions imposed on the unknown regression function to be weakened somewhat, extended convergence results in the spirit of Sacks, and proposed a theorem (without proof) which is an extension of RMSA to least squares.

The emphasis in this report is on the unproven portion of Gladyshev's theorem, on robustizing it along the lines suggested by Evans, Kersten and Kurz, and on the application of the results to problems of current interest, e.g., an interferometer used for spacecraft attitude control which measures angle of arrival (AOA), a correlator receiver, a tone-ranging system which measures time of arrival (TOA), etc.

Essentially, the theorem is a statement of the orthogonality principle in the RMSA framework, which is termed Stochastic Approximation Minimum Variance Least Squares (SAMVLS). The theorem is robustized via the batch-nonlinear-linear (B-N-L) and linear-nonlinear (L-N) approaches.

A. RMSA Extension to Least Squares (SAMVLS) and Small Sample Theory

The following theorem establishes the asymptotic convergence of RMSA to minimum variance least squares:

Theorem:  Let(|,r|),  (^j,t)j),  •••,  be  a sequence  of  two-dimensional  real  random 
vectors,  E[^  ] < oo,  E[r|  ] < oo.  Further,  let  x j , x^,  • • • , be  a random  sequence  in 
which  Xj  is  an  arbitrary  real  random  variable  and  x^.x^,  •••,  are  determined  from 

n+I  n ' n 'n'^n 

where A(·) > 0, then the sequence $\{x_n\}$ converges w.p. 1 to a value θ which turns the expression $E[(\xi\theta - \eta)^2]$ into a minimum. If, in addition, $A E[\xi^2] > \tfrac{1}{2}$, the sequence $\sqrt{n}\,(x_n - \theta)$ is asymptotically normal with mean zero and variance

$$\sigma^2 = A^2 (2 A E[\xi^2] - 1)^{-1}\, E[(\theta \xi^2 - \theta E[\xi^2] - \xi\eta + E[\xi\eta])^2]$$

Proof:

If we recognize that

$$M(x_n; \theta) = E[Y(x_n; \theta)] = E[(\xi_n x_n - \eta_n)\,\xi_n]$$

then the proof follows a procedure similar to that of Theorem B1 of Reference 12. Because of their length and complexity, the details of the proof are omitted.

Since in practice one always deals with a finite sample size, the question naturally arises as to how well asymptotic theory approximates the true situation. We examine Gladyshev's extension of RMSA to least squares in the small sample case (with constant gain coefficients). This illustrates the significance of, and the need for, adaptive gain coefficients in terms of the efficiency of the procedure.
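To make the small sample behavior concrete, the following sketch runs a scalar recursion of the assumed form $x_{n+1} = x_n - (A/n)\,\xi_n(\xi_n x_n - \eta_n)$ with a constant gain coefficient A. This form is an assumption inferred from the regression function $E[(\xi_n x_n - \eta_n)\xi_n]$ above, since the recursion in the theorem statement is not legible in this copy; the function names and the test model $\eta = \theta\xi + \text{noise}$ are likewise hypothetical.

```python
import random

def samvls(pairs, A, x1=0.0):
    """Run x_{n+1} = x_n - (A/n) * xi_n * (xi_n * x_n - eta_n) over (xi, eta) pairs."""
    x = x1
    for n, (xi, eta) in enumerate(pairs, start=1):
        x -= (A / n) * xi * (xi * x - eta)
    return x

def simulate(theta, n_samples, noise_std, seed):
    """Generate (xi, eta) pairs with eta = theta * xi + Gaussian noise."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_samples):
        xi = rng.gauss(0.0, 1.0)
        pairs.append((xi, theta * xi + rng.gauss(0.0, noise_std)))
    return pairs

pairs = simulate(theta=2.0, n_samples=20_000, noise_std=0.5, seed=1)
print("estimate of theta:", samvls(pairs, A=1.0))
```

Here ξ has unit variance, so A = 1 satisfies the condition $A E[\xi^2] > 1/2$ of the theorem; rerunning with A much closer to 1/2 is one way to observe the slower decay of the initial bias.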


With  the  linear  regression  function  of  the  form  M(x  ; 0)  = or.  (x  - 0)  + b where 
2 n in  ti  n 

a.  = E[e  ^]  , b = -E[e  q ] , Y(x  ; 0)  = M{x  ; 0)  + Z{x  ) and  Z{x  ) = ^^x  - ^ q - 
in  n n 'n-*  n n n n ^n  n n 'n 

E[4  x^  ^n  ~ which  merely  shifts  the  intercept  of  the  regression  function. 

Furthermore,  we  introduce  a sequence  of  i.i.d.r.v.  satisfying 

E[e  ] =0,  E[e  ^1=1=  var  e 
'•  n-*  ' n-*  n 


Z(x  ) = £ p 
' n n 

such  that 

E[Z(x^)]  = 0 and  E[Z^(x^)]  = 

X —0 

n 

as  required  in  the  Theorem. 

We  estimate  the  root  0 of  M{x)  by  the  RMSA  method,  i.e.,  choose  x^  arbitrarily 
and  x^,x^,  • • • , are  calculated  from 

X = X -a  [tt,(x  -0)  +e  pi 
n+1  n r n ' 


which is the expected bias at step n. It gives a measure of how far down the sequence an inappropriate choice of $V_1$ biases the results.

The variance of $V_n$ is given by


, n-  1 


Var(V^)  = Var(x^)  = p ^ 

i-  1 


n-  1 

f ^ 

k=i+i 


1 

2 

J “li 


(3) 


Thus, Eqs. (1)-(3) give the exact bias and variance of the estimate of x at any step n of the estimation process. One can easily show, by letting $(1 - \alpha/k) \approx e^{-\alpha/k}$, that

$$\lim_{n \to \infty} E[V_n] = 0 \qquad (4)$$

and

$$\lim_{n \to \infty} \operatorname{Var}(V_n) = \rho^2 \alpha^2 / [n\, \alpha_1^2 (2\alpha - 1)] \qquad (5)$$

Now if we redefine $a_n = A/n$, where $A = [\alpha/E(\xi^2)]$, we obtain the desired form of the result

$$\lim_{n \to \infty} \operatorname{Var}(V_n) = \frac{\rho^2 A^2}{n[2 A E(\xi^2) - 1]}, \qquad A E[\xi^2] > \tfrac{1}{2} \qquad (6)$$

The results of Eqs. (4), (5) and (6) show that under certain conditions on $A E[\xi^2]$, the biases arising from bad initial guesses rapidly tend to zero with increasing sample size, and the expected squared error of $x_n$ is of order 1/n.
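The step from Eq. (5) to Eq. (6) is a direct substitution. As a sketch, assuming $\alpha_1 = E(\xi^2)$ is the slope of the linear regression function introduced earlier in this section, the definition $A = \alpha/E(\xi^2)$ gives $\alpha = A\,E(\xi^2)$, so that

```latex
\frac{\rho^2 \alpha^2}{n\,\alpha_1^2\,(2\alpha - 1)}
  = \frac{\rho^2 A^2 E^2(\xi^2)}{n\,E^2(\xi^2)\,[2 A E(\xi^2) - 1]}
  = \frac{\rho^2 A^2}{n\,[2 A E(\xi^2) - 1]},
\qquad
2\alpha - 1 > 0 \iff A E(\xi^2) > \tfrac{1}{2}.
```

The positivity requirement on the denominator is what produces the side condition attached to Eq. (6).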

Sacks showed that, under general conditions, with $a_n = A/n$, $x_n$ is asymptotically normal with variance given by Equation (6). A suitable choice for A is the one that minimizes Equation (6). This leads to choosing $A = E^{-1}(\xi^2)$, where

$$\lim_{n \to \infty} \operatorname{Var}(V_n) = \rho^2 / [n E^2(\xi^2)] \qquad (7)$$

asymptotically. If we assume that Y(x; θ) is normally distributed with mean $E[\xi^2](x - \theta)$ and variance ρ², we have

$$1 / E[(\partial \log f(x; \theta)/\partial\theta)^2] = \rho^2 / E^2(\xi^2) \qquad (8)$$

and the Cramér-Rao lower bound on the variance of estimates of θ is Equation (8). The amount of Fisher information for the above case in a sample of size n is given by

$$n I[f; \theta = 0] \equiv n \int \left[\frac{f'(x)}{f(x)}\right]^2_{\theta = 0} f(x)\,dx$$


or the Fisher information per observation is $E^2(\xi^2)/\rho^2$.

A more general result holds from the following considerations. The Gladyshev form of the estimator can be fully efficient (asymptotically) if $A = E^{-1}(\xi^2)$. The practical difficulty arises that $E^{-1}(\xi^2)$ is unknown and must be guessed or estimated. This is precisely why adaptive gain coefficients are introduced to estimate the slope (near the root) of the regression function at every step. However, the efficiency of this process is only weakly dependent on the correct choice of $E^{-1}(\xi^2)$. The asymptotic efficiency is Eq. (7) divided by Eq. (6), which yields

$$\mathrm{Eff} \equiv [2 A E(\xi^2) - 1] / [A^2 E^2(\xi^2)], \qquad A E(\xi^2) > \tfrac{1}{2} \qquad (9)$$


A plot of Eq. (9) shows a broad maximum and a tendency toward zero efficiency at $A E(\xi^2) = 1/2$ in a very gradual manner. This has been substantiated by Hodges and Lehmann (Ref. 6), who have shown that for $A E(\xi^2) = 1/2$ the asymptotic variance is given by

$$\operatorname{Var}(V_n) = \frac{\rho^2 \log n}{4 n E^2(\xi^2)}$$

and the efficiency becomes 4/(log n + γ), where γ is Euler's constant.
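The broad maximum of Eq. (9) is easy to tabulate. The sketch below writes the efficiency as a function of the single quantity $u = A\,E(\xi^2)$, which is a change of variable introduced here for convenience; the function name `efficiency` is hypothetical.

```python
def efficiency(u):
    """Eq. (9) with u = A * E(xi^2); defined for u > 1/2."""
    if u <= 0.5:
        raise ValueError("Eq. (9) requires A*E(xi^2) > 1/2")
    return (2.0 * u - 1.0) / (u * u)

for u in (0.55, 0.75, 1.0, 1.5, 2.0, 3.0):
    print(f"A*E(xi^2) = {u:4.2f}   efficiency = {efficiency(u):.3f}")
```

Full efficiency occurs at u = 1, i.e., $A = E^{-1}(\xi^2)$, and the loss remains moderate when A is mis-specified by a factor of two on the high side, which is the sense in which the choice of A is not critical.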

4 

In  terms  of  the  requirements  for  the  estimator  of  the  adaptive  gain  coefficient 

- 1 2 

A^(Xj^,  • • • f the  above  result  points  out  that  an  exact  estimate  of  E (^  ) (in  general, 
the  slope  of  the  regression  function)  is  not  needed  at  every  step  because  of  the  broad- 
ness of  the  efficiency  curve.  The  conditions  required  for  the  estimator  of  the  adaptive 
gain  coefficient,  A^(")are:  E[A(*)]“*A,  E | A^  - E(A^(  • ))  | — 0 and  E[  A^(  • )]  -*•  A as 

n -*  oo;  in  addition,  A in  the  neighborhood  of  E ^(4  ^)  will  give  high  efficiency  at  each 
step. 

B.  Minimax  Theory  Applied  to  Rank  Tests  and  SAMVLS 

In  robustizing  SAMVLS  two  approaches  are  possible:  Huber's^  light  limiter  in- 

4 

fluence  function  (LLIF)  or  M-estimators  and  rank  statistic  preprocessor,  or  R-esti- 
mators.  In  this  section,  the  mathematical  background  which  permits  meaningful  com- 
parison of  these  two  approaches  is  presented. 

7 

Let  Xj^,  x^,  • • • , x^  be  i.  i.  d.  r . V.  withc.d.f.  F(x  - 0)  and  p.  d.  f . f(x-6).  Huber 
proposed  a maximum  likelihood  estimator  of  0 called  an  M- estimate,  which  satisfies 

w 

Yj  4>(x.  - M)  - 0 , 

i=  1 


where  4<(x)  is  such  that  il'i'x)  = -4<(x).  If  4'(x)  is  monotonic,  M is  essentially  uniquely 

determined.  Huber^  has  shown  that  the  M- estimator  is  asymptotically  normal  with 

l/2 

mean  0 and  asymptotic  variance  of  n M. 


1 
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V 


M 


[/4,'(x)f(x)dx]2 


This theory is useful in the study of mixture distribution noise of the form $F = (1 - \varepsilon)G + \varepsilon H$, where $0 \le \varepsilon < 1$ is a fixed number, $G$ is a fixed and symmetric distribution and $H$ is a variable c.d.f. Huber demonstrated that there exists a family of distributions which includes the mixture distribution for which there exists a saddle point $V_M(\psi_0, F_0)$ satisfying

$$V_M(\psi_0, F) \le V_M(\psi_0, F_0) = \frac{1}{I(F_0)} \le V_M(\psi, F_0).$$

The least favorable density, $f_0$, is Gaussian in the middle and double-exponential on the tails. The M-estimator is the corresponding maximum likelihood estimator with $\psi_0 = -f_0'/f_0$. The associated $\psi_0$ is given by the LLIF, which satisfies the conditions of the minimax result.
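The defining equation of the M-estimate can be solved numerically. The sketch below is only an illustration: it assumes the limiter takes the clipped-linear form (linear for |x| < k, held at ±k outside), and the names `limiter`, `m_estimate` and the default `k = 1.5` are hypothetical choices, not values taken from the report. Because ψ is monotonic, the sum is non-increasing in M and a bisection search suffices.

```python
import random


def limiter(x, k):
    """Assumed light limiter psi(x): linear for |x| < k, clipped to +/-k outside."""
    return max(-k, min(k, x))


def m_estimate(samples, k=1.5, tol=1e-9):
    """Solve sum_i psi(x_i - M) = 0 for M by bisection.

    The score is >= 0 at M = min(samples) and <= 0 at M = max(samples),
    so the root is bracketed by the sample range.
    """
    lo, hi = min(samples), max(samples)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        score = sum(limiter(x - mid, k) for x in samples)
        if score > 0:
            lo = mid   # estimate too small: move right
        else:
            hi = mid   # estimate too large (or exact): move left
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rng = random.Random(0)
    theta = 1.0
    data = [theta + rng.gauss(0.0, 1.0) for _ in range(500)]
    data += [theta + rng.gauss(0.0, 30.0) for _ in range(25)]  # gross outliers
    print("sample mean:", sum(data) / len(data))
    print("M-estimate :", m_estimate(data))
```

With a few gross outliers in the sample, the M-estimate typically stays closer to θ than the sample mean, which is the behavior the minimax argument is designed to guarantee.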

Jackel (Ref. 8) has shown that the minimax theory established by Huber also applies to the estimators derived from rank tests, as well as to estimators based upon linear combination of order statistics, with the asymptotic variances of the three classes of estimators being identical. With rank tests of the Chernoff-Savage form

$$T_N = \sum_{i=1}^{N} a_{N,i}\, Z_{N,i}$$

with scores $a_{N,i} = J(i/N)$, we define the R-estimator as the solution of $T_N(R) = 0$ under the hypothesis with $m = n$, $N = 2n$. The asymptotic variance of $n^{1/2} R$ is then

$$V_R(\psi, F) = \frac{\int J^2(t)\,dt}{\left[\int \frac{d}{dx}\{J[F(x)]\}\, f(x)\,dx\right]^2} \qquad (10)$$

and with $t = F(x)$, $J(t) = \psi(x)$, $V_M(F) = V_R(F)$ asymptotically. We note that the numerator of $V_R$ does not depend on $F$, unlike the numerator of $V_M$. This indicates that the R-estimator is likely to be more robust than the M-estimator. This observation will be further supported in subsequent discussion and simulations.

Applying the minimax theory for the M-estimator, the R-estimator corresponding to the M-estimator is defined by

$$J_0(t) = \psi_0(x), \qquad t = F_0(x),$$

with $F_0$ defined by

$$f_0(x) = \begin{cases} (1 - \varepsilon)\, g(-x_0)\, \exp[k(x + x_0)] & \text{for } x < -x_0 \\ (1 - \varepsilon)\, g(x) & \text{for } -x_0 < x < x_0 \\ (1 - \varepsilon)\, g(x_0)\, \exp[-k(x - x_0)] & \text{for } x > x_0 \end{cases}$$

for some $x_0$ and $k$ depending on $G$ and $\varepsilon$. $\psi_0 = -f_0'(x)/f_0(x)$, a generalized LLIF, is monotonic, and for $x < -x_0$ and $x > x_0$ is a constant.


This result, along with the minimax bound for the M-estimator, allows us to obtain lower bounds for the asymptotic variance of the robust SAMVLS. If we define a general nonlinear odd-symmetric robustizing transformation, $T$, satisfying the requirements of the theorem in Section A, which may represent either a nonparametric rank statistic or the LLIF, then the asymptotic variance of the robust SAMVLS is

$$V_{SA}^{T} = \frac{\operatorname{Var}_F T(\cdot)}{n\, a_1^2}\left[\frac{A^2 a_1^2}{2 A a_1 - 1}\right] \ge \frac{\operatorname{Var}_F T(\cdot)}{n\, a_1^2},$$

where $\operatorname{Var}_F T(\cdot)$ is the variance of $T(\cdot)$ under $F$ and

$$a_1 = E_F[\,T'(\cdot)\,].$$


Two types of nonparametric statistics, a two-sample Mann-Whitney-Wilcoxon (MWWNS) of the form

$$W^m(\cdot) = \frac{1}{m^2}\sum_{i=1}^{m}\sum_{j=1}^{m} \operatorname{sgn}(x_i - y_j)$$

and a Wilcoxon one-sample symmetrical (WSRNS) of the form

$$S^m(\cdot) = \sum_{i>j} \operatorname{sgn}(x_i + x_j) + \sum_{j=1}^{m} \operatorname{sgn}(x_j),$$

are particularly useful as B-N-L robustizers of SAMVLS. While requiring storage, the batch nonparametric statistic preprocessed transformation achieves, with a moderate sample size of m, approximate asymptotic normality, with a constant variance, independent of the underlying CDF.
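The two batch statistics can be sketched in a few lines. This is an illustration under stated assumptions: the two-sample statistic is taken as the average of sgn(x_i - y_j) over all pairs (the 1/m² normalization is inferred from the later relation W = 2U - 1), and the one-sample statistic is taken in its sign form with x_i + x_j. The function names are hypothetical.

```python
def sgn(x):
    """Sign function returning -1, 0 or +1."""
    return (x > 0) - (x < 0)


def mww_statistic(xs, ys):
    """Two-sample Mann-Whitney-Wilcoxon statistic, normalised to [-1, 1]:
    the average of sgn(x_i - y_j) over all pairs (assumed normalisation)."""
    total = sum(sgn(x - y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def wsr_statistic(xs):
    """One-sample Wilcoxon signed-rank statistic in sign form:
    sum_{i>j} sgn(x_i + x_j) + sum_i sgn(x_i)."""
    m = len(xs)
    pair_sum = sum(sgn(xs[i] + xs[j]) for i in range(m) for j in range(i))
    return pair_sum + sum(sgn(x) for x in xs)


if __name__ == "__main__":
    batch = [0.4, -0.2, 1.1, 0.7, 250.0]   # last sample is an outlier
    print(wsr_statistic(batch), wsr_statistic(batch[:-1] + [2.0]))
```

Both statistics depend on the data only through signs of sums or differences, so replacing a large positive sample by an arbitrarily larger one leaves the output unchanged; this is the sense in which the batch preprocessor bounds the influence of outliers.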


Unfortunately, this is not the case for the light limiter influence function robustizing transformation, where the limiter merely removes outliers. Furthermore, the asymptotic normality and robustness properties of the corresponding M-estimator are only applicable asymptotically since in the case of the LLIF transformation there is no batch preprocessing corresponding to the MWWNS or the WSRNS.

C.  A Comparison of Robustizing Methods with Applications to Nonlinear Systems

In the previous sections we have shown that the nonparametric rank statistics (i.e., MWWNS and WSRNS) and the LLIF robustizing transformations applied to RMSA are asymptotically equivalent. The main differences are in the moderate sample asymptotic behavior and in the rate of convergence.

In order to compare the performance of the robust SAMVLS under different robustizing transformations, we assume that the noise is described by the mixture distribution model. This allows us to select the parameters of the robustized recursion under the nominal CDF's and determine the degree of robustness in the presence of contamination. One-dimensional, application-oriented examples drawn from practical systems illustrate the above approach.

To determine time-of-arrival (TOA) or to measure the angle-of-arrival (AOA), and in other application areas (see Ref. 4), many sensors, navigation and communication systems utilize correlator-discriminators and phase detectors. These devices compare the difference between possibly nonlinear functions of two signals in additive noise to measure the unknown signal parameters. One signal is usually taken as the reference (i.e., known) and the parameter measured (i.e., unknown) is derived with respect to that reference.

Although the generic characteristic of these devices is usually nonlinear, their region of operation is usually restricted to a linear portion of the characteristic. The output of these devices is a noisy error signal which is linearly related to the difference between the two signal parameters; it drives a control loop that adjusts the reference signal with respect to the incoming signal in a direction to drive the error (i.e., the difference) to zero.

With sampled values of such systems available in a digital implementation, we have the exact framework, with the appropriate identification of terms, for RMSA applied to least squares (SAMVLS). Three examples of such systems are given: an interferometer used in spacecraft attitude control to measure AOA (see Ref. 9), the discriminator portion of a correlation receiver used to measure relative TOA (see Ref. 10), and a CW phase-comparison ranging system (see Ref. 11). We assume that the systems under consideration are bandlimited and are narrow band with respect to some center frequency. The output of these systems is sampled for digital processing at a rate slow enough to have negligible cross-correlation between the signal and noise samples, i.e., the samples are i.i.d.

In the systems under consideration, the noise is modelled as a stationary narrow band process. The probability density function of the phase of the sinusoidal signal plus narrow band noise is given by

$$f(\gamma) = \frac{1}{2\pi}\exp\!\left(-\frac{A^2}{2R(0)}\right) + \frac{A\cos\gamma}{\sqrt{2\pi R(0)}}\exp\!\left[-\frac{A^2\sin^2\gamma}{2R(0)}\right]$$

for $|\gamma| < \pi$ and zero elsewhere, where $A$ is the signal amplitude and $R(0) = \sigma^2$ is the noise power. If the input signal-to-noise ratio (SNR) is high in both channels (e.g., interferometer), SNR $= A^2/2R(0)$, and we limit the range of angle measurement to $|\gamma| < 5°$ (i.e., $\cos\gamma \approx 1$, $\sin\gamma \approx \gamma$), then the expression for $f(\gamma)$ reduces to

$$f(\gamma) = \frac{S}{\sqrt{\pi}}\exp(-S^2\gamma^2), \qquad |\gamma| < 5°,$$

where, for notational convenience, we let $S^2 = $ SNR, and $f(\gamma)$ is Gaussian with zero mean and variance $1/2S^2$. The sinusoidal signals plus narrow band noise appearing at the inputs to the phase detector are multiplied and the difference between their phases (having removed the sum frequency terms) is given by $\phi = \alpha - \theta + v$, where $v$ is Gaussian with mean zero and variance $\sigma^2$.

Consider the SAMVLS to adjust $\alpha = \theta$ to reorient a spacecraft or to measure relative range, with sequential measurements available. In the case of the correlator-discriminator this is exactly equivalent to aligning the epoch of the replica code with the incoming code to determine relative TOA. For the phase comparison ranging system, this is equivalent to fine alignment of the reference tone phase to match the incoming tone phase. In terms of the SAMVLS, these operations can be defined with £ = 1 and $Y(\alpha_n; \theta) = \alpha_n - \theta + v_n$. The SAMVL recursion is then of the form

$$\alpha_{n+1} = \alpha_n - (A/n)\,[\,Y(\alpha_n; \theta)\,] \qquad (11)$$
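The recursion of Eq. (11) is simple enough to simulate directly. The following is a minimal sketch, not the simulation code used in the report: the function name `samvl`, its arguments and the noise callable are hypothetical, and the measurement is taken as Y = α_n - θ + v_n.

```python
import random


def samvl(theta, alpha0, gain, n_steps, noise):
    """Run alpha_{n+1} = alpha_n - (gain / n) * (alpha_n - theta + v_n).

    `noise` is a zero-argument callable returning the next noise sample v_n.
    """
    alpha = alpha0
    for n in range(1, n_steps + 1):
        y = alpha - theta + noise()   # noisy error signal Y(alpha_n; theta)
        alpha -= (gain / n) * y       # decreasing-gain correction
    return alpha


if __name__ == "__main__":
    rng = random.Random(1)
    estimate = samvl(theta=1.0, alpha0=0.0, gain=1.0, n_steps=1000,
                     noise=lambda: rng.gauss(0.0, 1.0))
    print("estimate after 1000 steps:", estimate)
```

For unit gain this recursion reduces to θ minus the running mean of the noise samples, which is why the unrobustized estimate is sensitive to heavy-tailed contamination.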


To robustize Eq. (11) one introduces either batch preprocessing and a nonparametric rank statistic or the LLIF. The output of either combination is then applied to $Y(\alpha_n; \theta)$. The MWWNS robustized SAMVL recursion is of the form


n+1 


= Q - 
n 


^2  Zj  Zj®®"^“'[m(n-l)  + i]  ■®[m(n-l)+j]  ''■'^[m( 


n-l)  + i] 


The  WSRNS  robustized  SAMVL  recursion  is  of  the  form  with 

4>-  - j.-'®  j.-  j.- 

1 n+i  n+i  n+i 


Q a - (A  /n)s'^(<t.) 

n+1  n s ' 


where

$$S^m(\cdot) = \sum_{i>j} \operatorname{sgn}(X_i + X_j) + \sum_{i=1}^{m} \operatorname{sgn}(X_i)$$


The  LLIF  robustized  SAMVLS  recursion  is  of  the  form 
‘^n  + l = 

where

$$\psi(x) = \begin{cases} x, & |x| < k \\ \operatorname{sgn}(x), & |x| > k \end{cases}$$


To  evaluate  the  optimum  gain  coefficients  for  the  above  recursions,  we  have  to  find  the 
respective  regression  functions  in  a form  which  satisfies  theorem  1 of  Ref.  4 and 
evaluate  the  derivative  of  the  regression  function  at  the  root. 


In reference to the theorem of Section A, the parameters of the MWWNS robustized recursion are obtained with

$$W^m(\cdot) = \frac{1}{m^2}\sum_{i=1}^{m}\sum_{j=1}^{m} \operatorname{sgn}(X_i - Y_j)$$

and define

$$U^m(\cdot) = \frac{1}{m^2}\sum_{i=1}^{m}\sum_{j=1}^{m} U(X_i - Y_j)$$

such that $W = 2U - 1$; then it can be shown that

$$E\,W = 2\int_0^{\infty} f_{x-y}(z/\alpha, \theta)\,dz - 1$$

where $f_{x-y}(\cdot)$ is the pdf of the difference between independent variables $(X_i - Y_j)$.

The reciprocal of the optimum gain coefficient, $a_1^W = 1/A$, is given by

$$E\,W = a_1^W(\alpha - \theta) + O(|\alpha - \theta|^2)$$

where

$$a_1^W = 2\, f_{v_1 - v_2}(0)\,\big|_{\alpha = \theta}$$

and

$$f_{v_1 - v_2} = f_{v_1 + v_2}$$

since $v$ is an i.i.d. sequence of symmetric Gaussian r.v.'s. The asymptotic variance of the MWWNS robustized recursion is given by


W W 
V = V 
SA  R 


V 


4A/(V4iror  ) - 1 


W 


where 


n{l/4Tr  0-  ) 


I 

L 


For a WSRNS robustized SAMVLS, proceeding in a manner analogous to the MWWNS, details of which are covered in Ref. 4,

$$a_1^S = 2\,[\,m(m-1)\, f_{v_1+v_2}(0) + m\, f_v(0)\,]$$

where, under the nominal Gaussian CDF for $v$,

$$f_{v_1+v_2}(0) = \frac{1}{\sqrt{4\pi}\,\sigma} \quad \text{and} \quad f_v(0) = \frac{1}{\sqrt{2\pi}\,\sigma}.$$

The asymptotic variance of the WSRNS robustized SAMVLS is given by

$$V_{SA}^{S} = V_R^{S}\left[\frac{A^2 (a_1^S)^2}{2 A a_1^S - 1}\right] \ge V_R^{S}$$

where

$$V_R^{S} = \frac{(1/24)(m+1)(2m+1)}{n\,(a_1^S)^2}.$$

In the LLIF robustized SAMVLS, in reference to the theorem in Section A, the salient parameters of the robust recursion are, with $\eta = \theta + v$,


%JLf 

Ef^  Y{tt.0')  = / lj^{o  - 0')f^(a  - 0)da' 


-oo 


and 


= f f (x)  dx  = [ 1 - 2 erf(k)] 

1 1 V 

-k 

if  f (x)  is  Gaussian.  The  asymptotic  variance  of  the  LLIF  robustized  SAMVLS  is  given 
by 

'^SA  = 


where

$$V_R^{L} = \frac{\int \psi^2(x)\, f_v(x)\,dx}{n\,(a_1^L)^2}$$

A series of Monte Carlo simulations was implemented to evaluate the performance of each robustized SAMVLS recursion, as well as the performance of unrobustized SAMVLS, in Gaussian, N(0,1), and in mixture noise. Two models were used for mixture noise. In the first model the nominal Gaussian noise N(0,1) was contaminated with a 10% high-variance Gaussian burst-noise component, i.e., 0.9N(0,1) + 0.1N(0,8). In the second case, the Gaussian burst-noise component was replaced with 2% Cauchy, CN(0,64), i.e., 0.98N(0,1) + 0.02CN(0,64). The Monte Carlo noise generator characteristics are summarized in Table I for a step size of 1000 and 10,000 iterations.

TABLE I.  Monte Carlo noise generator characteristics.

Measurement Noise Simulated        Mean       Standard      Minimum        Maximum     Number
                                              Deviation                                of Samples

1 + N(0,1)                         1.01319     1.00455      -2.89972        4.8185     10,000
1 + 0.9N(0,1) + 0.1N(0,8)         -3.78762     0.90516      -7.30975        0.6424     10,000
1 + 0.98N(0,1) + 0.02CN(0,64)      2.03288    79.29695   -1251.17603     5079.6875     10,000
1 + N(0,1)                         1.06148     0.97842      -2.89972        4.5472      1,000
1 + 0.9N(0,1) + 0.1N(0,8)         -3.73960     0.89117      -7.30975        0.6424      1,000
1 + 0.98N(0,1) + 0.02CN(0,64)      1.04001    23.10217    -388.38037      247.1582      1,000

Note that even in the nominal Gaussian case, there is a slight positive bias in the noise generator. For the MWWNS and the WSRNS, a batch size of m = 10 was selected, which allowed comparison of the convergence rates between the two batch-integer-rank preprocessed SAMVLS.
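A mixture noise generator of the kind described above can be sketched as follows. This is not the generator used for Table I; it assumes that N(0,8) denotes a zero-mean Gaussian with variance 8 and that CN(0,64) denotes a Cauchy law with location 0 and scale 64, and all function names are hypothetical.

```python
import math
import random


def gaussian_burst(rng, variance=8.0):
    # Assumption: N(0, 8) is read as zero mean, variance 8.
    return rng.gauss(0.0, math.sqrt(variance))


def cauchy_burst(rng, scale=64.0):
    # Assumption: CN(0, 64) is read as Cauchy with location 0 and scale 64.
    return scale * math.tan(math.pi * (rng.random() - 0.5))


def mixture_sample(rng, eps, contaminant):
    """Draw one sample from (1 - eps) * N(0, 1) + eps * contaminant."""
    if rng.random() < eps:
        return contaminant(rng)
    return rng.gauss(0.0, 1.0)


def summarize(samples):
    """Mean, standard deviation, minimum, maximum and count, as in Table I."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return {"mean": mean, "std": math.sqrt(var),
            "min": min(samples), "max": max(samples), "n": n}


if __name__ == "__main__":
    rng = random.Random(1)
    noise = [1.0 + mixture_sample(rng, 0.02, cauchy_burst) for _ in range(10000)]
    print(summarize(noise))
```

Under these assumptions the Gaussian mixture has variance 0.9 + 0.8 = 1.7, whereas the Cauchy mixture has no finite variance, so its sample standard deviation varies widely from run to run.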

First, the simulations were computed for the Gaussian mixture case with a 10% burst-noise component. The simulation results for a step-size of 1000 iterations are summarized in Table II and in Figures 1 and 2. Note that the ratio of the sample variances in Table II between the MWWNS robustized and WSRNS robustized SAMVLS is 1.898, which is very close to the theoretically predicted value of approximately 2. Figure 1 shows the performance of SAMVLS without any robustizing transformation in nominal Gaussian and mixture noise. Without robustizing, the convergence of the recursions is slow.

TABLE II.  Monte Carlo simulation comparison summary (1,000 samples), scalar SAMVLS in mixture noise.

Robustizing Method        Mean       Variance     Minimum      Maximum     Batch Size

MWWNS (Two Sample)        1.01438    0.07239     -1.32851      1.15423     10
WSRNS (One Sample)        0.96122    0.03815     -0.55250      1.03668     10
LLIF (k = 0.317)          0.90488    0.00344      0.31700      0.94291     -
Unrobustized              1.12588    0.00404      0.64237      1.48352     -

Fig. 1.  Unrobustized scalar SAMVLS in Gaussian and mixture noise.

The performance of the robustized scalar SAMVLS recursion is shown in Figure 2. Note that both the WSRNS and MWWNS reach true values after 200-300 iterations. However, the LLIF is biased, as shown in Figure 2. While converging fast with a small sample variance, it converges to a biased value which depends on the nonlinearity, and the bias cannot be predicted. This demonstrates the drawback of the L-N approach via the LLIF. At the same time, the WSRNS and the MWWNS converge fast to the true parameter values.

Fig. 2.  Robustized scalar SAMVLS in mixture noise.

Current work in Signal Processing involves one- and two-dimensional deterministic and random signals. The following are brief descriptions of recent results on various aspects of the program. Details can be found in the stated references.

A.  The Two-to-One Rule in Data Smoothing (Ref. 1)

If a signal is estimated by a weighted average of the data in the interval $(t - c, t + c)$, then the variance $\sigma^2$ of the estimate decreases, but its bias $b$ increases, with increasing $c$. It is shown that in high accuracy estimates, the mean-square error $e$ is minimum if $c$ is such that $\sigma = 2b$, regardless of the form $h(t)$ of the smoothing weight. Furthermore, the resulting $e$ is minimum if $h(t)$ is a truncated parabola.
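The two-to-one condition can be checked with a short calculation. The scalings used here are assumptions made for illustration and are not quoted from the referenced paper: for small $c$, white measurement noise gives a variance that falls as $\sigma^2 \approx K_1/c$, and a smooth signal with a symmetric weight gives a bias that grows as $b \approx K_2 c^2$. The constants $K_1$ and $K_2$ are hypothetical; they depend on $h(t)$, on the noise level and on the curvature of the signal.

```latex
\begin{align*}
e(c) &= \sigma^2 + b^2 = \frac{K_1}{c} + K_2^2 c^4, \\
\frac{de}{dc} &= -\frac{K_1}{c^2} + 4 K_2^2 c^3 = 0
  \;\Longrightarrow\; \frac{K_1}{c} = 4 K_2^2 c^4, \\
\sigma^2 &= 4 b^2 \;\Longrightarrow\; \sigma = 2b .
\end{align*}
```

Under these assumptions the condition at the minimum does not involve $K_1$ or $K_2$, which is consistent with the statement that it holds regardless of the form of $h(t)$.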

B.  Generalized Sampling Expansion (Ref. 2)

It is shown that a bandlimited function $f(t)$ is uniquely determined in terms of the samples $g_k(nT)$ of the responses $g_k(t)$ of $m$ linear systems with input $f(t)$, sampled at $1/m$ the Nyquist rate. Various known extensions of the sampling theorem follow as special cases of the resulting generalized sampling expansion of $f(t)$.

C.  The Zero-Crossing Problem in Deconvolution (Ref. 3)

In the determination of the transform $X(\omega)$ of a signal $x(t)$ in terms of the function $y(t)$ and the sum $w(t) = x(t) * y(t) + n(t)$, the presence of the term $n(t)$ (noise) introduces large errors in the vicinity of the zero-crossings $\omega_i$ of $Y(\omega)$. A method is presented for reducing these errors. The method is based on the fact that, at these zero-crossings, the frequency components $N(\omega_i)$ of the noise process $n(t)$ equal the frequency components $W(\omega_i)$ of the data process $w(t)$. This leads to an estimate of $N(\omega)$ for all $\omega$, resulting in a reduction of the errors, particularly in the vicinity of $\omega_i$.

D.  The Factorization Problem for Time-Limited Functions and Trigonometric Polynomials (Ref. 4)

The following form of the factorization problem is considered: Given a function $g(t)$ vanishing for $|t| > a$ and with non-negative Fourier transform $G(j\omega)$, find a function $f(t)$ with energy spectrum $G(j\omega)$ and such that $f(t) = 0$ outside the interval $(0, a)$. A numerical method for determining $f(t)$ is developed involving only discrete Fourier series.

E.  Identification of Systems Driven by Non-Stationary Noise (Ref. 5)

The system function of an unknown system can be determined in terms of the power spectrum of the output if the input $n(t)$ is a random process with known properties. This well-known method of system identification is based on the assumption that $n(t)$ is stationary. In this paper, it is shown that a system with time-limited impulse response can be identified in terms of the running spectrum of the output even if the input is nonstationary. The analysis includes the evaluation of the autocovariance of the amplitude spectrum and the energy spectrum of a random process. The variance of the estimates is determined under the assumption that the input process is normal. However, the results hold for a more general class of inputs if the time of observation is sufficiently large.

DIFFRACTION ELIMINATION FOR POINT SOURCES
A. Papoulis and C. Chamzas

A method is developed for recovering the precise location and intensity of point sources in terms of their image greatly distorted by diffraction effects and noise. The method is based on an iteration scheme that restores the missing high frequencies of the unknown object.

Consider a one-dimensional optical system with amplitude spread function $h(x)$. With the usual assumptions (Ref. 3), the amplitude $g(x)$ of the image of a coherent object is given by

$$g(x) = \int_{-\infty}^{\infty} f(x - \xi)\, h(\xi)\, d\xi \qquad (1)$$

where $f(x)$ is the amplitude of the object.

Using capital letters for the Fourier transforms of the above functions, we conclude from Eq. (1) that

$$G(u) = F(u)H(u) \qquad (2)$$

The case which we are going to examine here is when the object is a double or multiple star and the optical system a telescope. Hence, if

$$d = 2a \qquad (3)$$

is the diameter of the lens of the telescope, then the Fourier transform of the spread function $h(x)$ is (see Fig. 1)

$$H(u) = \begin{cases} 1 & |u| < a \\ 0 & |u| > a \end{cases} \qquad (4)$$

and the amplitude of the object $f(x)$ is given by

$$f(x) = \sum_{k=1}^{K} a_k\, \delta(x - x_k) \qquad (5)$$

From the properties of the Fourier transform it follows that $h(x)$ is of infinite extent. Thus, a consequence of Eq. (4) is the diffraction effect, yielding as image, $g(x)$, a smooth version of the object,

$$g(x) = \sum_{k=1}^{K} a_k\, h(x - x_k) \qquad (6)$$

If the $x_k$ are close to each other then the smoothing of the image makes the stars indistinguishable.

Fig. 1.  Schematic of the diffraction problem. (a) Unknown object $f(x)$. (b) Fourier transform $F(u)$ of $f(x)$. (c) Resulting image $g(x)$. d: Lens diameter.
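The loss of resolution described by Eq. (6) can be illustrated numerically. The sketch below assumes the transform convention in which the ideal low-pass $H(u)$ of Eq. (4) corresponds to $h(x) = \sin(ax)/(\pi x)$; the function names, the peak-counting rule and the chosen separations are hypothetical and serve only as an illustration.

```python
import math


def spread(x, a):
    """Amplitude spread function h(x) = sin(a x) / (pi x), assumed to be the
    inverse transform of the unit-height band |u| < a of Eq. (4)."""
    if x == 0.0:
        return a / math.pi
    return math.sin(a * x) / (math.pi * x)


def image(x, sources, a):
    """g(x) = sum_k a_k h(x - x_k) for point sources [(a_k, x_k), ...]."""
    return sum(amp * spread(x - pos, a) for amp, pos in sources)


def count_peaks(sources, a, x_min, x_max, steps=2000):
    """Count strict local maxima of g(x) that exceed half of its global maximum."""
    xs = [x_min + (x_max - x_min) * i / steps for i in range(steps + 1)]
    g = [image(x, sources, a) for x in xs]
    top = max(g)
    return sum(1 for i in range(1, steps)
               if g[i] > g[i - 1] and g[i] > g[i + 1] and g[i] > 0.5 * top)


if __name__ == "__main__":
    close = [(1.0, -0.3), (1.0, 0.3)]   # separation well below pi / a
    far = [(1.0, -5.0), (1.0, 5.0)]     # separation well above pi / a
    print(count_peaks(close, 1.0, -20.0, 20.0), count_peaks(far, 1.0, -20.0, 20.0))
```

With a = 1, two equal sources 0.6 apart merge into a single main lobe, while sources 10 apart remain visible as two lobes; the iteration described in this report addresses the first situation.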
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The  developed  algorithm  using  the  assumption  of  Eq.  (5)  achieves  a restoration  of 
the missing high frequencies ($|u| > a$), and therefore recovers the object completely. Below we look at the specific case of double stars whose image is so greatly distorted by diffraction effects and noise that the object appears as a single star (i.e., the resolving power of the telescope cannot separate the two stars). The iteration has the following scheme

$$F_n(u) = \begin{cases} F(u)H(u) & |u| < a \\ [1 - H(u)]\, G_n(u) & |u| > a \end{cases} \qquad \text{with } 0 \le H(u) \le 1 \qquad (7)$$

and

$$g_{n+1}(x) = \begin{cases} f_n(x) & \text{if } f_n(x) \ge C_n \\ 0 & \text{if } f_n(x) < C_n \end{cases} \qquad (8)$$

where $f_n(x)$ is the inverse Fourier transform of $F_n(u)$ and $C_n$ an adaptive truncating level, determined by the algorithm. A complete description of the algorithm is given in Reference 2. Hence, in Eq. (7) with the assumption of coherent object

$$H(u) = \begin{cases} 1 & |u| < a \\ 0 & |u| > a \end{cases} \qquad (4)$$

If the object is incoherent with illuminance $f(x)$ then (Refs. 3, 4) the illuminance $g(x)$ of its image is

$$g(x) = \int_{-\infty}^{\infty} f(\xi)\, \tilde{h}(x - \xi)\, d\xi \qquad (9)$$

with

$$\tilde{h}(x) = |h(x)|^2 .$$

Thus from Fourier transform properties we obtain

$$\tilde{H}(u) = \begin{cases} 1 - |u|/2a & |u| \le 2a \\ 0 & |u| > 2a \end{cases} \qquad (10)$$

and in the relation Eq. (7) of the iteration, in this case, we must use $\tilde{H}(u)$ instead of $H(u)$.

The versatility of inserting $\tilde{H}(u)$ inside the algorithm Eq. (7) makes it feasible to skip the deconvolution problem.

Example: The simulated coherent object $f(x)$ is shown in Fig. 2(a) with

$$f(x) = \delta(x - x_1) + 0.5\, \delta(x - x_1 - 4\Delta x) + n(t) \qquad (11)$$

where $\Delta x$ is the sampling interval and $n(t)$ is white noise uniformly distributed in $[-0.1, 0.1]$. The Fourier transform $F(u) = R(u) + jX(u)$ is shown in Figure 2(b). Figure 3(a) shows the image $g(x)$ of the object and Fig. 3(b) shows

$$G(u) = F(u)H(u) = \begin{cases} F(u) & |u| < a \\ 0 & |u| > a \end{cases} \qquad (12)$$

the Fourier transform of the image.

In Fig. 4(a), (b) and (c) the 0-th, 4-th and 30-th steps of the iteration are shown. The object has been recovered completely at the end of the 30-th iteration.


In the case when the two stars are of the same power, $a_1 = a_2$, the limits of the iteration have been calculated numerically. The unknown object consists of two equal pulses located 7 sampling intervals ($\Delta x$) apart and the added noise is white and uniformly distributed. Thus the unknown object $f(x)$ is

$$f(x) = \delta(x - x_1) + \delta(x - x_1 - 7\Delta x) + n(t) = s(t) + n(t) \qquad (13)$$

In the diagram of Fig. 5 the vertical axis is the S/N ratio and the horizontal axis is analogous to the diameter of the lens. An FFT (Fast Fourier Transform) of 256 points has been used and d = 1 means that the Fourier transform of the object is known only for 10 sampling intervals.

The upper curve in the diagram indicates the limits for which the algorithm recovers the signal completely. The second curve indicates where the algorithm recovers the signal with an accuracy of one sampling interval (±0.4% error).

Fig. 5 region labels: complete estimation (0% error area); error.

A CONTRIBUTION TO EDGE DETECTION USING TWO-WAY ANOVA TECHNIQUES
P. Legakis and L. Kurz

In this report, the edge detection techniques using ANOVA techniques in conjunction with quantile statistics, described by the authors previously (Ref. 1), are generalized from one-way to two-way ANOVA procedures. The generalization is of importance if the image to be processed is severely corrupted by noise. Since efficient quantile estimation techniques are readily available (Refs. 2, 3), and reduction and preconditioning of the data set by quantile partitioning makes the simple fixed effects model of ANOVA applicable to gray level and texture edge detection problems, the procedure described here is particularly attractive in practical applications.

A.  Mathematical  Formulation  of  the  Problem 

Using the same notation and physical modelling as in Ref. 1, the framework for the generalized procedure follows the usual steps in bridging the gap between the one-way and two-way ANOVA procedures.

In the two-way ANOVA case, the point of interest $(r, s)$ is the center of a neighborhood which in this case is a rectangular array of $M_A$ rows and $M_B$ columns. Thus, the rectangular array ${}_{rs}X$ of the neighborhood points consists of elements $\{{}_{rs}X_{ij}\}$ such that
X..  = q 

rs  ij  ^r-w+i;  s-z+j 
i = 1,2,-  • j = 1, 2,  • • - .Mg 


r = w,w  + l,  • s = z,ztl,---,L 


n-  z 


w = M^  -r  2 ; z = Mg  -i-  2 
Adopting  the  same  notation  as  for  the  one-way  case. 

rs^  = ‘^r-w+JM/ 

•A 


r = w,w+l,---,L^_^;  s = z.  z +1 , • • . , L^- z 


(1) 


(2) 


Here, again, the expression for the quantiles has $M_A$ row indices and $M_B$ column indices.
The  restriction  on  the  values  of  r and  s follows  the  same  reasoning  as  for  the  one-way 
case  and  it  means  that  the  first  and  last  (w-1)  rows  and  the  first  and  last  (z-1)  columns 
will  not  be  classified  but  they  will  be  used  in  the  classification  of  points  beginning  from 
(w,  z)  and  terminating  with  the  point  (L^-w,  L^-  z).  The  requirements  imposed  on  the 
scanner resolution $(N_x, N_y)$ are that $N_x$ be much greater than $M_B \times n$ and that $N_y$ be much greater than $M_A \times n$, where $n$ is the data reduction factor. If the above restrictions are met and if the minimum distance between successive vertical and horizontal edges is much greater than $M_A \times n$ and $M_B \times n$ respectively, the effect of ignoring all border points in the classification as well as false indications of edges due to the presence of actual edges in the neighborhood will be minimized or eliminated.


The corresponding hypothesis testing problem for the two-way ANOVA case takes the form:

$H_A$:  rows are homogeneous       -  no horizontal edge
$H_B$:  columns are homogeneous    -  no vertical edge
$K_A$:  rows are heterogeneous     -  horizontal edge
$K_B$:  columns are heterogeneous  -  vertical edge

and the pertinent statistics for the neighborhood points based on $(r, s)$ are


$${}_{rs}\bar{X}_{i\cdot} = \frac{1}{M_B}\sum_{j=1}^{M_B} q_{r-w+i;\, s-z+j} \qquad (3)$$

$${}_{rs}\bar{X}_{\cdot j} = \frac{1}{M_A}\sum_{i=1}^{M_A} q_{r-w+i;\, s-z+j} \qquad (4)$$

$${}_{rs}\bar{X}_{\cdot\cdot} = \frac{1}{M_A M_B}\sum_{i=1}^{M_A}\sum_{j=1}^{M_B} q_{r-w+i;\, s-z+j} \qquad (5)$$

$${}_{rs}S_A = M_B \sum_{i=1}^{M_A} {}_{rs}\bar{X}_{i\cdot}^{\,2} - M_A M_B\; {}_{rs}\bar{X}_{\cdot\cdot}^{\,2} \qquad (6)$$

$${}_{rs}S_B = M_A \sum_{j=1}^{M_B} {}_{rs}\bar{X}_{\cdot j}^{\,2} - M_A M_B\; {}_{rs}\bar{X}_{\cdot\cdot}^{\,2} \qquad (7)$$

$${}_{rs}S_W = \sum_{i=1}^{M_A}\sum_{j=1}^{M_B} \left( q_{r-w+i;\, s-z+j} - {}_{rs}\bar{X}_{i\cdot} - {}_{rs}\bar{X}_{\cdot j} + {}_{rs}\bar{X}_{\cdot\cdot} \right)^2 \qquad (8)$$

$${}_{rs}f_A = \frac{(M_B - 1)\; {}_{rs}S_A}{{}_{rs}S_W} \qquad (9)$$

$${}_{rs}f_B = \frac{(M_A - 1)\; {}_{rs}S_B}{{}_{rs}S_W} \qquad (10)$$

The statistics ${}_{rs}f_A$ and ${}_{rs}f_B$ are sensitive to the presence of edges in the horizontal and diagonal, or vertical and diagonal directions, respectively; namely,

${}_{rs}f_A < T_A$    and  ${}_{rs}f_B < T_B$      ->  no edge
${}_{rs}f_A \ge T_A$  and  ${}_{rs}f_B < T_B$      ->  horizontal edge
${}_{rs}f_A \ge T_A$  and  ${}_{rs}f_B \ge T_B$    ->  diagonal edge
${}_{rs}f_A < T_A$    and  ${}_{rs}f_B \ge T_B$    ->  vertical edge

Due to asymptotic normality, the ${}_{rs}f$ statistics have the central and non-central F distributions with $(M_A - 1)$, $(M_A - 1)(M_B - 1)$ degrees of freedom in the case of ${}_{rs}F_A$ and $(M_B - 1)$, $(M_A - 1)(M_B - 1)$ in the case of ${}_{rs}F_B$. If small sample sizes are used, the asymptotic normality assumptions are no longer valid but the expressions for ${}_{rs}f_A$ and ${}_{rs}f_B$ are still useful as means for obtaining possible edges. The choice of the appropriate thresholds, $T_A$, $T_B$, as well as the choice of neighborhood size $M_A$, $M_B$, is best done through simulation of known patterns.
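The two statistics and the four-way decision can be sketched as follows. This is an illustration using the standard two-way ANOVA sums of squares without replication; the function names and the thresholds passed to `classify` are hypothetical, and in practice the thresholds would be chosen by simulation as noted above.

```python
def two_way_f(x):
    """Return (f_A, f_B) for a rectangular neighborhood x given as a list of
    M_A rows, each with M_B entries.

    f_A = (M_B - 1) * S_A / S_W responds to differences between rows,
    f_B = (M_A - 1) * S_B / S_W responds to differences between columns.
    S_W must be nonzero, i.e. the neighborhood must contain some residual noise.
    """
    m_a, m_b = len(x), len(x[0])
    row_mean = [sum(row) / m_b for row in x]
    col_mean = [sum(x[i][j] for i in range(m_a)) / m_a for j in range(m_b)]
    grand = sum(row_mean) / m_a
    s_a = m_b * sum((r - grand) ** 2 for r in row_mean)
    s_b = m_a * sum((c - grand) ** 2 for c in col_mean)
    s_w = sum((x[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
              for i in range(m_a) for j in range(m_b))
    return (m_b - 1) * s_a / s_w, (m_a - 1) * s_b / s_w


def classify(x, t_a, t_b):
    """Apply the threshold rules to the neighborhood x."""
    f_a, f_b = two_way_f(x)
    if f_a >= t_a and f_b >= t_b:
        return "diagonal edge"
    if f_a >= t_a:
        return "horizontal edge"
    if f_b >= t_b:
        return "vertical edge"
    return "no edge"


if __name__ == "__main__":
    ripple = [[0.5 if (i + j) % 2 == 0 else -0.5 for j in range(4)] for i in range(4)]
    patch = [[(10.0 if i < 2 else 20.0) + ripple[i][j] for j in range(4)] for i in range(4)]
    print(two_way_f(patch), classify(patch, 10.0, 10.0))
```

In this toy neighborhood the two upper rows are darker than the two lower rows, so only the row statistic is large and the patch is labeled as containing a horizontal edge.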

B.  Simulation Results

In this section, the advantages of using quantile statistics in two-way ANOVA techniques over simple ANOVA techniques applied to the original data for the solution of gray level and texture edge detection problems are demonstrated. The reference pattern is the same as given in Fig. 1 of Reference 1. If the signal-to-noise ratio is low, the quantile techniques based on two-way ANOVA procedures perform very well (compare Figs. 1 and 2, also Figures 3 and 4). Similar results are obtained for problems involving texture edge detection (see Figures 5 and 6).

In general, the simulation studies point to the fact that quantile-ANOVA techniques are much more efficient than simple ANOVA techniques for the solution of edge detection problems. In a severe noise environment the use of two-way quantile-ANOVA techniques is mandated if meaningful edge detection is to be achieved.

M-GRAY LEVEL DETECTION FOR MODERATE AND HIGH SIGNAL-TO-NOISE RATIOS
P. Legakis and L. Kurz

The edge detection and the associated two-gray level detection problems have been treated by the authors elsewhere (Refs. 1, 2, 3). The generalization of the edge detection procedures from two- to M-gray levels is straightforward but there is a difficulty associated with this extension; though edges are detected, there is no indication what gray levels are enclosed by the edges. This fact requires further methodological refinements which permit a generalization of the gray level detection problem from M = 2 to M > 2.

In Section A, the noiseless M-gray level detection is reviewed, which sets the stage for the treatment of the noisy case. In Section B, the general problem of M-gray level detection under noisy conditions is stated and placed in a mathematical framework of hypothesis testing. In Section C, a simple sequential procedure for M-gray level detection is outlined and a test statistic which yields good performance for moderate and high signal-to-noise ratios is introduced. In Section D, a threshold for the statistic which minimizes the average probability of gray level misclassification is derived. Finally, computer simulations in support of the theory presented in this report are given.

A.  M-Gray  Level  Detection  in  the  Absence  of  Noise 

Let $R(c_0, c_M)$ be the range of the gray scale and let $I(i,j) \in R(c_0, c_M)$ be the
brightness of an image array $H$ at the point $(i,j)$. $I(i,j)$ can be represented by an f-bit
quantization number $(b_1 b_2 \cdots b_f)$, where the $b_i$ are binary digits and $1 \le f \le 8$. The
lower and upper bounds for $f$ were chosen to be 1 and 8, respectively, because a gray
scale cannot have fewer than 2 levels, and $2^8 = 256$ is more than most images require for
excellent quality reproduction. Each quantization number could, in the limit, correspond
to one gray level. However, since for $f = 8$ it would be very difficult to establish and
reproduce 256 brightness levels, a reduced scale consisting of $M < 256$ brightness levels
is desirable. Thus, a reduced M-gray level scale consists of levels $g_k$ such that

$$c_{k-1} \le g_k \le c_k, \qquad k = 1, 2, \ldots, M$$

where

$c_0$: quantization number corresponding to the darkest point in the gray scale
$c_M$: quantization number corresponding to the brightest point in the gray scale.

Denoting by $[g_k]$ the brightness of level $g_k$, the relationship

$$[g_1] < [g_2] < [g_3] < \cdots < [g_M]$$

holds, and for every $g_k$ there is at least one quantization number.
In the absence of noise, the transformation from quantization numbers to gray
levels is based on a sample of a single observation, and $I(i,j)$, the brightness of an image
array at the point $(i,j)$, maps onto level $g_k$ if

$$I(i,j) \in [c_{k-1}, c_k], \qquad k = 1, 2, \ldots, M \qquad (1)$$

Depending on the gray scale partition, i.e., the choice of $c_0, c_1, c_2, \ldots, c_M$, which
involves not only the choice of $M$ but also the actual $g_k$, one could generate different
reproductions of the same image. In fact, even if $M$ is fixed and the choice of $g_k$
made, one could still generate different representations of the same original image
array based on the choice of characters that are used in an overstrike arrangement to
produce a given brightness level.

If no additional a priori information about the image is available, the assumption
is made that the gray scale is partitioned into M equal levels, i.e., into levels which
consist of the same number of quantization numbers. This assumption seldom affects
the quality of the reproduced image in the noiseless case. As will be shown in subsequent
sections, this assumption does not affect the gray level detection in the noisy
case. Logically, this insensitivity of gray level detection to the partition of the scale
follows from the fact that in an unequal partition of the gray scale, the different levels
still differ by a shift-in-the-mean in addition to the change-of-scale. The statistic to
be used for any gray level detection is sensitive to both, and unequal partition does not
influence the decision process. For simplicity, equal partitioning of the gray scale will
be assumed for the remainder of the report. Thus, in the noiseless case, once the
gray scale has been partitioned, the identification of an image array point with a particular
gray level follows directly from Equation (1). The problem to be resolved is what
happens when the data are corrupted by noise.
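
The noiseless mapping of Equation (1) under equal partitioning can be illustrated with a minimal Python sketch. The function name `gray_level` and the choice of integer quantization numbers 0 to 2^f - 1 are assumptions made for illustration; they are not part of the report.

```python
def gray_level(q, f_bits=8, M=4):
    """Map an f-bit quantization number q to a gray level k in 1..M.

    Assumes an equal partition: every level covers the same number of
    quantization numbers, so M must divide 2**f_bits (illustrative choice).
    """
    n_codes = 2 ** f_bits
    if not 0 <= q < n_codes:
        raise ValueError("q is outside the f-bit range")
    if n_codes % M != 0:
        raise ValueError("equal partition requires M to divide 2**f_bits")
    width = n_codes // M          # quantization numbers per gray level
    return q // width + 1         # levels are numbered 1..M


if __name__ == "__main__":
    for q in (0, 63, 64, 255):
        print(q, "->", gray_level(q))
```

With f = 8 and M = 4 each level spans 64 quantization numbers, so a single observation is enough to identify the level when no noise is present.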

B.  The  M-Gray  Level  Classification  Problem  Under  Noise  Conditions 

The two-gray level classification problem under noise conditions has received
considerable attention (Ref. 4), and procedures such as analysis of variance, slippage
algorithms, as well as the quantile procedures of Refs. 1, 2, 3 are quite adequate for almost
any noise environment. The problem was solved by posing it in a hypothesis testing
framework with level 0 represented by a distribution of given mean and variance and
level 1 by another distribution of different mean and the same variance. Note that the
assumption of equal variance amounts to equal partition of the gray scale.

One can generalize the above problem to the M-gray level case by considering a
gray scale that is spanned by M distribution functions having different means and the
same variances if equal partition is assumed, or having different means and different
variances in the more general case. A mathematical statement of the problem now
follows.
Let the gray scale be spanned by M distribution functions $F_1(x), \ldots, F_M(x)$
representing the M gray levels, and let $F_k(x) = F(x - \mu_k)$, where $\mu_k$ is the mean for level
$k$ and

$$\mu_1 < \mu_2 < \cdots < \mu_M,$$

that is, the distributions are all identical except for a shift-in-the-mean. Assume the
availability of reference samples $X_k$ with $X_k \sim F(x - \mu_k)$, $k = 1, 2, \ldots, M$. Let
$Y = (y_1, y_2, y_3, \ldots, y_n)$ be a sample from an image array, $H$, that needs to be classified
as coming from one of the $F(x - \mu_k)$. A classification procedure could be either
sequential, consisting of at most M steps, or parallel, requiring that all M steps be executed
simultaneously, or some combination of these procedures. The choice between a
sequential and a parallel procedure involves a tradeoff between the higher cost of
implementing the parallel procedure and its better performance in terms of less waiting time
resulting from the simultaneous execution of M steps.

If a sequential procedure is followed, the sample to be classified is compared to
the reference sample $X_k$ and a test statistic $T_k$ results. Then hypothesis $H_k$ is
accepted or rejected on the basis of whether $T_k$ exceeds or does not exceed a threshold.
If $H_k$ is accepted, meaning that the test sample is taken to come from $F(x - \mu_k)$, the
classification stops; otherwise, the test sample is matched with reference sample $X_{k+1}$,
and so on, until a match occurs or the reference samples are exhausted. In the latter case,
where all reference samples have been exhausted without making a classification, a decision
is made on the basis of min or max $(T_1, T_2, \ldots, T_M)$. If a parallel procedure is followed,
a set of test statistics $T_1, T_2, T_3, \ldots, T_M$ is generated and the test sample is classified
on the basis of min or max $(T_1, T_2, \ldots, T_M)$. It must be noted that the alternative to
hypothesis $H_k$ is taken to be $H_{k+1}$ not because the gray levels are ordered in real images
(they are not) but because only adjacent distributions are expected to cause a
misclassification on the $H_k$ hypothesis. Also, $H_{k-1}$ is not considered as an alternative at
the kth step because its rejection at the previous stage is the reason why the procedure has
reached the kth step.

In the remainder of the report, sequential procedures are explored in detail because
they are simpler and more economical to implement. The extension of these to
parallel operations follows directly. Usually, sequential procedures are used to minimize
the number of samples required before making a decision. Here, all samples are
available and the emphasis is on minimizing the number of steps required to arrive at
a classification. If, in addition, use is made of the classification of the previous sample
as a starting point of comparison, the number of steps required for classification is
further reduced because it is expected that neighboring points frequently belong to the
same class.
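
The sequential procedure described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: `statistic(k)` is a hypothetical callable returning the test statistic for level k (0-based), acceptance means the statistic exceeds its threshold, the fallback uses the maximum of the computed statistics, and the wrap-around visiting order when starting from the previous classification is a choice made here, not one specified in the report.

```python
def classify_sequential(statistic, thresholds, start=0):
    """Return (level, steps_used) for a sequential M-level classification.

    statistic(k): hypothetical callable giving T_k for level k (0-based).
    thresholds[k]: acceptance threshold for level k.
    start: level to try first, e.g. the class of the neighboring point.
    """
    M = len(thresholds)
    seen = {}
    for step in range(M):
        k = (start + step) % M            # assumed visiting order
        seen[k] = statistic(k)
        if seen[k] > thresholds[k]:       # H_k accepted: stop here
            return k, step + 1
    # All reference samples exhausted: decide on the largest statistic.
    best = max(seen, key=seen.get)
    return best, M


if __name__ == "__main__":
    stats = [2, 9, 4]
    print(classify_sequential(lambda k: stats[k], [5, 5, 5]))           # two steps
    print(classify_sequential(lambda k: stats[k], [5, 5, 5], start=1))  # one step
```

Starting from the level assigned to the previous point saves a step whenever neighboring points share a class, which is the saving the text describes.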
C.  M-Gray Level Sequential Pattern Detection

Let $X_k \sim F(x - \mu_k)$ and $Y$ be the reference and test samples, respectively, each
consisting of $n$ independent, identically distributed (i.i.d.) observations. The test
statistic, $T_k$, for the kth step of the procedure is defined as follows.

On the basis of $X_k$, let $[q_{k,50-\gamma},\, q_{k,50+\gamma}]$ be a quantile interval centered around
the 50th percentile,* where the quantile $q_{k,p}$ satisfies

$$\int_{-\infty}^{q_{k,p}} dF(x - \mu_k) = p.$$


Sample $Y$ consists of points from at most two distributions corresponding to any
two gray levels. The cases of interest are those for which the quantile interval
contains observations from adjacent distributions. In all other cases the contributions to
misclassification are considered negligible. The latter case corresponds, roughly, to
high signal-to-noise ratio conditions.

The test statistic $T_k$ is now formed by counting the number of observations in $Y$
that fall inside $[q_{k,50-\gamma},\, q_{k,50+\gamma}]$:

$$T_k = \mathrm{Card}\{\, y_j : q_{k,50-\gamma} \le y_j \le q_{k,50+\gamma} \,\}, \qquad k = 1, 2, \ldots, M; \quad j = 1, 2, \ldots, n$$

The formulation of the test statistic is distribution-free since quantile information rather
than exact knowledge of the underlying distribution is required. From an implementation
point of view, $T_k$ is very simple: it requires on the order of O(n) operations once the
quantile interval is established. A simple procedure for establishing the quantiles from
the observed data, even if the data are contaminated, is available in Ref. 6, where
dependence in m-interval detectors is also treated.
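
As an illustration of the counting statistic, the following Python sketch estimates the quantile interval from a reference sample and counts the test observations inside it. The nearest-rank percentile estimator and the function names are assumptions made for this example; the report itself refers to Ref. 6 for its quantile estimation procedure.

```python
import math


def quantile_interval(reference, gamma):
    """Empirical [q_(50-gamma), q_(50+gamma)] from a reference sample.

    Uses the nearest-rank percentile rule (an illustrative choice).
    """
    xs = sorted(reference)
    n = len(xs)

    def percentile(p):
        rank = max(1, math.ceil(p / 100 * n))
        return xs[rank - 1]

    return percentile(50 - gamma), percentile(50 + gamma)


def count_statistic(test, lo, hi):
    """T_k: number of test observations falling inside [lo, hi]; O(n)."""
    return sum(1 for y in test if lo <= y <= hi)


if __name__ == "__main__":
    lo, hi = quantile_interval(range(1, 101), gamma=25)
    print(lo, hi, count_statistic([10, 30, 50, 70, 90, 75, 25], lo, hi))
```

Once the interval is fixed, the statistic needs a single pass over the test sample, which is the O(n) cost mentioned above.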

Under the hypothesis $H_k$, the observations of $X_k$ and $Y$ are generated by the
same continuous distribution $F(x - \mu_k)$. From the classical "occupancy" model, which is
extensively treated in the literature, the statistic $T_k$ is binomially distributed, i.e.,
$T_k \sim B(n, p_{H_k})$, with $p_{H_k}$ denoting the probability that an observation from $Y$ falls within
the prespecified quantile interval $[q_{k,50-\gamma},\, q_{k,50+\gamma}]$ and $n$ the total number of trials.
Thus,

$$\mathrm{Prob}_{H_k}(T_k) = \binom{n}{T_k}\, p_{H_k}^{\,T_k}\, (1 - p_{H_k})^{\,n - T_k}, \qquad T_k = 0, 1, \ldots, n.$$

For example, if $n = 20$ and … , which is very small, as expected.

* The 50th percentile is chosen because that is where most of the information is
concentrated in the shift-in-the-mean problem.

The reason why the classification procedure has reached the kth step is that
$H_j$, $j = 1, \ldots, k-1$, have been rejected. The alternative at the kth step is $H_{k+1}$, and the
distribution of $T_k$ under $H_{k+1}$ is assumed to also be of the binomial form, i.e.,

$$T_k \sim B(n, p_{H_{k+1}})$$

where

$$p_{H_{k+1}} = \theta_{k+1}\, p_{H_k}.$$

The formulation of the distribution of $T_k$ under the alternative is also in agreement with
what is intuitively expected. Namely, for large signal-to-noise ratios, $\theta_{k+1}$ is small
and so is $p_{H_{k+1}}$. When the alternative approaches the hypothesis, $\theta_{k+1} \to 1$ and
$T_k/H_{k+1} \to T_k/H_k$. Thus, under the alternative,

$$\mathrm{Prob}_{H_{k+1}}(T_k) = \begin{cases} \dbinom{n}{T_k} (\theta_{k+1} p_{H_k})^{T_k} (1 - \theta_{k+1} p_{H_k})^{n - T_k}, & T_k = 0, 1, \ldots, n \\ 0, & \text{otherwise.} \end{cases}$$
D.  Classification Procedure for the kth Step and a Threshold Specification

As was noted in the previous section, the classification procedure has reached
the kth step because the first (k-1) alternatives have been rejected. The classification
for the kth step proceeds as follows:

(1)  From $X_k$ and $Y$, obtain $q_{k,50-\gamma}$, $q_{k,50+\gamma}$ and $T_k$.

(2)  Compare $T_k$ with the threshold $T_{ko}$.

(3)  Decide that $H_k$ is true if $T_k > T_{ko}$, or that $H_{k+1}$ is true (only temporarily, until
verified in the subsequent step) if $T_k \le T_{ko}$.

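The threshold $T_{ko}$ is chosen to minimize the average probability of misclassification

$$P_{ek} = \mathrm{Prob}(H_k)\,\mathrm{Prob}(H_{k+1}/H_k) + \mathrm{Prob}(H_{k+1})\,\mathrm{Prob}(H_k/H_{k+1}) \qquad (2)$$

where

Prob($H_k$): a priori probability of $H_k$
Prob($H_{k+1}/H_k$): type I error
Prob($H_{k+1}$): a priori probability of $H_{k+1}$
Prob($H_k/H_{k+1}$): type II error

It is customary to assume that the Prob($H_k$) are equal for all $k$, unless additional
information is available. Since

$$\alpha_k = \mathrm{Prob}(H_{k+1}/H_k) = \sum_{T_k = 0}^{T_{ko}} \mathrm{Prob}_{H_k}(T_k)$$

and

$$\beta_k = \mathrm{Prob}(H_k/H_{k+1}) = \sum_{T_k = T_{ko}+1}^{n} \mathrm{Prob}_{H_{k+1}}(T_k),$$

Equation (2) becomes

$$P_{ek} = \mathrm{Prob}(H_k) \sum_{T_k = 0}^{T_{ko}} \mathrm{Prob}_{H_k}(T_k) + \mathrm{Prob}(H_{k+1}) \sum_{T_k = T_{ko}+1}^{n} \mathrm{Prob}_{H_{k+1}}(T_k) \qquad (3)$$

where

$$\mathrm{Prob}_{H_k}(T_k) = \binom{n}{T_k}\, p_{H_k}^{\,T_k}\, (1 - p_{H_k})^{\,n - T_k}$$

$$\mathrm{Prob}_{H_{k+1}}(T_k) = \begin{cases} \dbinom{n}{T_k} (\theta_{k+1} p_{H_k})^{T_k} (1 - \theta_{k+1} p_{H_k})^{n - T_k}, & T_k = 0, 1, \ldots, n \\ 0, & \text{otherwise.} \end{cases}$$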


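It is well known from Bayesian theory that $T_{ko}$ is the solution of

$$\mathrm{Prob}(H_k)\,\mathrm{Prob}(T_k/H_k) = \mathrm{Prob}(H_{k+1})\,\mathrm{Prob}(T_k/H_{k+1}).$$

Therefore, for equal a priori probabilities,

$$\left[ \frac{1 - \theta_{k+1} p_{H_k}}{\theta_{k+1} (1 - p_{H_k})} \right]^{T_{ko}} = \left[ \frac{1 - \theta_{k+1} p_{H_k}}{1 - p_{H_k}} \right]^{n} \qquad (4)$$

Taking the natural logarithm of both sides of Eq. (4) results in

$$T_{ko} \ln\!\left( \frac{1 - \theta_{k+1} p_{H_k}}{\theta_{k+1} (1 - p_{H_k})} \right) = n \ln\!\left( \frac{1 - \theta_{k+1} p_{H_k}}{1 - p_{H_k}} \right) \qquad (5)$$

so that

$$T_{ko} = n\, \frac{\ln\!\left[ (1 - \theta_{k+1} p_{H_k}) / (1 - p_{H_k}) \right]}{\ln\!\left[ (1 - \theta_{k+1} p_{H_k}) / \big(\theta_{k+1} (1 - p_{H_k})\big) \right]} \qquad (6)$$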

Equation (6) gives the threshold $T_{ko}$ which minimizes the average probability of
misclassification in the kth step. Since $T_{ko}$ as given above is not likely to be an integer,
and since it must be constrained to integer values, $T_{ko}$ is given in terms of $\lfloor T_{ko} \rfloor$ and
$\lceil T_{ko} \rceil$, where

$\lfloor T_{ko} \rfloor$ is the largest integer less than $T_{ko}$

$\lceil T_{ko} \rceil$ is the smallest integer greater than $T_{ko}$.

To select between $\lfloor T_{ko} \rfloor$ and $\lceil T_{ko} \rceil$ one has to substitute into Eq. (3) and retain the
value which minimizes it. The parameter $\theta_{k+1}$ ranges from 0 to 1. If $\theta_{k+1} = 1$, Eq.
(6) takes an indeterminate form, while for $\theta_{k+1} \to 0$, $T_{ko} \to 0$, as expected. The total
probability of misclassification is minimized when it is minimized in each of the
k steps, and

$$P_e = \sum_{k \ge 1} P_{ek} \qquad (7)$$

where the sum runs over the steps of the procedure and $P_{ek}$ is given by Eq. (3) with
the threshold specified by Equation (6).
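
A short Python sketch may help to make the threshold selection concrete. It assumes the two binomial models B(n, p) under the hypothesis and B(n, theta*p) under the alternative, with equal a priori probabilities taken as 1/2 for the pair of hypotheses tested at one step (a normalization assumed here). The real-valued threshold is the point where the two binomial probabilities are equal; the function names are illustrative.

```python
import math


def binom_pmf(t, n, p):
    """Binomial probability of t successes in n trials."""
    return math.comb(n, t) * p ** t * (1 - p) ** (n - t)


def error_probability(t0, n, p, theta):
    """Average misclassification probability of Eq. (3), equal priors.

    Decision rule: accept the alternative when T <= t0.
    """
    alpha = sum(binom_pmf(t, n, p) for t in range(0, t0 + 1))
    beta = sum(binom_pmf(t, n, theta * p) for t in range(t0 + 1, n + 1))
    return 0.5 * alpha + 0.5 * beta


def threshold(n, p, theta):
    """Return (integer threshold, real-valued threshold) for 0 < theta < 1."""
    q = theta * p
    t_real = n * math.log((1 - q) / (1 - p)) / math.log(p * (1 - q) / (q * (1 - p)))
    candidates = {math.floor(t_real), math.ceil(t_real)}
    t_int = min(candidates, key=lambda t: error_probability(t, n, p, theta))
    return t_int, t_real


if __name__ == "__main__":
    print(threshold(n=20, p=0.5, theta=0.2))
```

For n = 20, p = 0.5 and theta = 0.2 the real-valued threshold is about 5.35, and substituting the two neighboring integers into the error expression selects 5.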

Figure  2 demonstrates  the  performance  of  the  procedure  described  in  this  report 
under  moderate  noise  conditions.  The  reference  pattern  is  given  in  Figure  1. 

Joint  Services  Technical  Advisory  Committee 

F44620-74-C-0056  P.  Legakis  and  L.  Kurz 

IBM 
BIVARIATE M-INTERVAL CLASSIFIERS WITH APPLICATION TO EDGE DETECTION
P.  Kersten  and  L.  Kurz 

In 1972 Ching and Kurz introduced a statistic which is a generalization of the
sign test. The sign test, one of the first nonparametric statistics, is usually applied
when testing the hypothesis $M = M_0$ against the alternative $M > M_0$ or $M < M_0$ or
$M \ne M_0$, where $M$ is the median. Accordingly, the sign test is defined by

$$\sum_{i=1}^{n} \mathrm{sgn}(X_i - M_0), \qquad \mathrm{sgn}(x) = \begin{cases} 1, & x > 0 \\ 0, & x = 0 \\ -1, & x < 0 \end{cases}$$

where the $X_i$ are independent and identically distributed samples from the population
under test. To generalize the sign test one assumes that a priori knowledge of $(m-1)$
quantiles $\{a_1, \ldots, a_{m-1}\}$ such that $F^{-1}(i/m) = a_i$, where $F$ is the cumulative distribution
function (CDF) of the sampled distribution, is given. Using these quantiles, the
observation space is partitioned into $m$ disjoint regions $A_i = (a_{i-1}, a_i]$, $i = 1, \ldots, m$, where
$a_0 = -\infty = -a_m$. Assuming that $F$ is strictly increasing, so that the $a_i$ are unique, and that
the probability density function $f = F'$ is continuous implies $f(a_0) = f(a_m) = 0$. With these
definitions and hypotheses one defines the m-Interval Partition Statistic as

$$t = \frac{1}{n} \sum_{i=1}^{m} \sum_{j=1}^{n} b_i\, I_{\{X_j \in A_i\}}$$

where the $b_i$ are the scores associated with each partition and $I_{\{X_j \in A_i\}}$ is the indicator
function of the event $\{X_j \in A_i\}$, i.e., $I_{\{X_j \in A_i\}} = 1$ if $X_j \in A_i$ and 0 otherwise. Now $t$
may be rewritten as $t = \frac{1}{n} \sum_{i=1}^{m} b_i n_i$, where $n_i$ is defined as the number of samples
falling into the partition $A_i$. If $m = 2$ and $b_1 = -1 = -b_2$, then $t$ is equivalent to the sign
statistic provided $E(X_i) = 0$ and $f(x) = f(-x)$, which implies that $M_0 = E(X_i)$.
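
The univariate m-Interval Partition Statistic can be sketched in a few lines of Python. The sketch assumes the statistic is the score-weighted count of samples per region divided by the sample size n (the normalization is an assumption here), with right-closed regions A_i = (a_{i-1}, a_i]; the function name is illustrative.

```python
import bisect


def m_interval_statistic(samples, quantiles, scores):
    """Score-weighted region counts divided by n.

    quantiles: the m-1 interior quantiles a_1 < ... < a_(m-1).
    scores: the m scores b_1, ..., b_m, one per region.
    """
    if len(scores) != len(quantiles) + 1:
        raise ValueError("need one score per region (m = len(quantiles) + 1)")
    counts = [0] * len(scores)
    for x in samples:
        # bisect_left puts x == a_i into region i, i.e. regions are right-closed.
        counts[bisect.bisect_left(quantiles, x)] += 1
    return sum(b * c for b, c in zip(scores, counts)) / len(samples)


if __name__ == "__main__":
    # m = 2 with scores (-1, +1) reproduces the (normalized) sign statistic.
    print(m_interval_statistic([-2, -1, 3, 4, 5], [0.0], [-1, 1]))
```

With m = 2, a single quantile at the median, and scores -1 and +1, the value is the sign statistic divided by n, which matches the special case noted in the text.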

Ching and Kurz demonstrate that this statistic is nonparametric, in the sense
that its Type I Error is independent of the functional form of the underlying distribution,
and robust, i.e., its performance is insensitive to changes in the underlying noise
distribution. It is interesting to note that the generalized sign-test classifiers have
been treated in a unified framework in Cochrane and Kurz (Ref. 2). This framework is
general enough to include not only the m-Interval Partition Tests but also the linear
rank statistics. Both papers discuss the robustness of the m-Interval Partition Test
and its efficiency for large sample sizes. Applications of rank tests to classification
problems have also been considered in Woinsky and Kurz (Refs. 3, 4) and Chadwick and
Kurz (Ref. 5).


Hodges obtained a bivariate extension of the sign test and derived its null
distribution. The general setting for the application of this statistic is a bivariate test of
hypothesis. Specifically, given two sets of independent bivariate samples, the first
from a bivariate distribution function with mean (0, 0) and the second from a bivariate
distribution function whose mean may be (0, 0) or $(\mu, \xi)$, one seeks to determine from
which population the last sample originated. Hodges considers the comparison of two
treatments where one has two sets of bivariate samples $(x'_i, y'_i)$ and $(x''_i, y''_i)$, $i = 1, \ldots, n$.
From these samples, a single sequence of bivariate samples is formed, $z_i = (x_i, y_i) =
(x''_i - x'_i,\; y''_i - y'_i)$, which belong to one of the following hypotheses:

$H_0$: $z_i = (x_i, y_i)$ have mean (0, 0), $i = 1, \ldots, n$,

$H_1$: $z_i = (x_i, y_i)$ have mean $(\mu, \xi) \ne (0, 0)$, $i = 1, \ldots, n$.

The central idea in this test consists of projecting the bivariate samples onto an axis
passing through the origin at an angle $\theta$ (hereafter referred to as the $\theta$-axis) and then
applying the univariate sign test along this axis for all possible values of $\theta \in [0, 2\pi]$.
One then uses the maximum value assumed by the sign test for all $\theta$ to accept or reject
the hypothesis. If one defines $\mathbf{1}_\theta$ as the unit vector of an axis at an angle $\theta$ with respect
to the x-axis, then one can express the Bivariate Sign Test as

$$B = \sup_{0 \le \theta \le 2\pi} \sum_{j=1}^{n} \mathrm{sgn}(z_j \cdot \mathbf{1}_\theta)$$

where $z_j \cdot \mathbf{1}_\theta$ is the dot product of the two-dimensional vector $z_j$ with $\mathbf{1}_\theta$, i.e., $z_j \cdot \mathbf{1}_\theta$
is the projection of $z_j$ onto the $\theta$-axis. Intuitively, the angle $\theta$ at which $B$ is maximum
should be approximately $\theta = \tan^{-1}(\xi/\mu)$ if $H_1$ is true, since the shift of mean occurs along
this axis. Under the hypothesis, no direction of $\theta$ should be favored and thus one
expects the supremum to occur at any angle. Unfortunately, both the univariate and
bivariate sign tests discard all the magnitude information in the samples. The
univariate m-Interval Partition Statistic retains enough magnitude information to obtain
a more efficient test statistic but, at the same time, does not surrender its robustness,
since the magnitude information is quantized via the partitioning of the observation
space. The m-Interval Partition Test may be extended to the bivariate case in
precisely the same manner as the sign test is extended to the bivariate case.
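
The projection idea behind the Bivariate Sign Test can be illustrated with the Python sketch below. It replaces the supremum over all angles by a maximum over a finite grid of `n_angles` equally spaced directions, which is an approximation introduced here for illustration and not Hodges' exact procedure.

```python
import math


def bivariate_sign_statistic(z, n_angles=360):
    """Grid approximation of the sup over theta of sum_j sgn(z_j . 1_theta).

    z: list of (x, y) pairs. Returns (best value, angle where it was first found).
    """
    best, best_theta = -math.inf, 0.0
    for i in range(n_angles):
        theta = 2 * math.pi * i / n_angles
        c, s = math.cos(theta), math.sin(theta)
        total = 0
        for x, y in z:
            proj = x * c + y * s                 # projection onto the theta-axis
            total += (proj > 0) - (proj < 0)     # sgn(proj)
        if total > best:
            best, best_theta = total, theta
    return best, best_theta


if __name__ == "__main__":
    shifted = [(1.0, 1.0), (2.0, 1.5), (0.5, 1.0), (1.5, 0.2)]
    print(bivariate_sign_statistic(shifted)[0])
```

When every sample lies in the same half-plane, some direction gives all positive signs and the statistic equals n; for a point set symmetric about the origin the signs cancel in every direction.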

The motivation for considering the tests described above arises from their application
to edge detection problems, where robustness is an essential ingredient. For instance,
in picture processing, the insensitivity of the classifier's performance to
changes in the underlying noise distribution is more important than using an optimum
test under strict assumptions on the noise distribution function to achieve the most
efficient operation. In Section A, where a more detailed description of the Bivariate
m-Interval Partition Statistic is presented, the nature of the test will guarantee its
robustness. In Section B, two sets of scores for this classifier are discussed, and
the performance of this statistic with these scores is the subject of Section C. An
application of the Bivariate m-Interval Partition Statistic to the edge detection of a
block pattern is considered in Section D, along with simulations of the test in various
noise environments.

A.  Description  of  the  Bivariate  m-Interval  Partition  Statistic 

The key to Hodges' extension of the sign test to the bivariate case was to project
the samples upon the $\theta$-axis and then apply the sign test along this axis for each $\theta$.
The supremum of these test statistics is used to accept or reject the hypothesis.
Under suitable conditions the m-Interval Partition Statistic is extended in the same
manner. Define $A_i(\theta) = (a_{i-1}(\theta), a_i(\theta)]$, $i = 1, \ldots, m$, as the partition along the $\theta$-axis,
so that $F_\theta^{-1}(i/m) = a_i(\theta)$, where $F_\theta$ is the marginal distribution of the samples
projected upon the $\theta$-axis. One may define

$$T(\theta) = \frac{1}{n} \sum_{i=1}^{m} \sum_{j=1}^{n} b_i\, I_{\{z_j \cdot \mathbf{1}_\theta \in A_i(\theta)\}}$$

and $T'$ as $\sup_{0 \le \theta \le 2\pi} T(\theta)$.

Throughout this analysis it is assumed that the joint PDF is symmetric, in that
$f(x, y) = f(-x, -y)$, and that $m$, the number of partitions, is even. The scores $b_i$ are
assumed to be odd with respect to the origin. As presently defined, this statistic is not
feasible to calculate, since $T(\theta)$ must be evaluated for each $\theta \in [0, 2\pi]$ and accordingly
$m-1$ quantiles $a_i(\theta)$ need to be known for each $\theta$. Therefore, one defines $\theta_i = 2\pi(i-1)/k$
and $T = \max_{1 \le i \le k} (T_i)$, where $T_i = T(\theta_i)$. To calculate $T$, only $(m-1)k$ quantiles need to be
known. The Bivariate m-Interval Partition Statistic is defined by $\max(T_1, \ldots, T_k)$,
which is less than or equal to $\sup_{0 \le \theta \le 2\pi} T(\theta)$, since the supremum is over a larger
class. Thus $T$ will not reduce to the Bivariate Sign Test for $b_1 = -1 = -b_2$ and
$F_\theta^{-1}(1/2) = a_1(\theta) = 0$. However, this presents no problem, since $T$ may be considered
upon its own merits, using the Bivariate Sign Test only as a benchmark. For
computations used in this paper, an angular increment of $\pi/12$ between successive $\theta_i$ was
found to yield acceptable Type I error and power.
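
A minimal Python sketch of the discretized statistic, the maximum of T(theta_i) over k directions, is given below. To keep it self-contained it assumes independent standard Gaussian components under the hypothesis, so that the projected marginal is standard normal at every angle and the same m-1 quantiles apply to all directions; in general the quantiles depend on the angle. The normalization by n and the function name are likewise assumptions.

```python
import bisect
import math
from statistics import NormalDist


def bivariate_m_interval(z, scores, k=24):
    """max over k directions of the m-interval statistic of the projections.

    z: list of (x, y) pairs; scores: m scores, odd about the origin.
    k = 24 corresponds to an angular increment of pi/12.
    """
    m = len(scores)
    # Assumed null model: N(0, 1) marginal along every direction.
    quantiles = [NormalDist().inv_cdf(i / m) for i in range(1, m)]
    n = len(z)
    best = -math.inf
    for i in range(k):
        theta = 2 * math.pi * i / k
        c, s = math.cos(theta), math.sin(theta)
        t = sum(scores[bisect.bisect_left(quantiles, x * c + y * s)]
                for x, y in z) / n
        best = max(best, t)
    return best


if __name__ == "__main__":
    print(bivariate_m_interval([(3.0, 0.0)] * 5, scores=[-2, -1, 1, 2]))
```

Only k(m-1) quantiles are needed for the whole computation, which is what makes the discretized statistic practical.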

For the set of quantiles $a_i(\theta) = F_\theta^{-1}(i/m)$, $i = 1, \ldots, m-1$, and $-a_0 = \infty = a_m$, one
needs to know the loci of these $a_i$ for all $\theta$. Since the joint PDF is symmetric, and
assuming the joint CDF $F(x, y) = F(x)F(y)$, the quantiles may be represented concisely
as in Fig. 1, where $F(\cdot)$ is a Cauchy Distribution. Figure 1 is representative of the
patterns one obtains for the quantile loci for other distributions such as the Bivariate
Double-Exponential and Gaussian. Space considerations prevent the illustration of
the loci for these latter distributions. The intersection of these curves with any axis
passed through the origin at angle $\theta$ yields the octiles of the marginal CDF $F_\theta(x)$.

Fig. 1  Plot of quantiles: bivariate Cauchy, independent components

Figure 1 illustrates the futility of trying to obtain a small sample distribution
in a manner paralleling Hodges or Ching and Kurz. Hodges and later Klotz
obtained the small sample distribution by setting up a correspondence between the
possible values of $T(\theta)$ and the sample paths of a random walk. Unfortunately, the
fact that $\theta$ is continuous and that the shape of these regions described is a function of
$F(x, y)$ precludes any possibility of establishing such a correspondence for the
Bivariate m-Interval Partition Test. The same factors, plus the dependence of the $T(\theta_i)$,
also forestall the application of generating functions. Thus the small sample distribution is
an open problem.

Practically, the quantiles will not be known a priori. To estimate these
quantiles a multivariate Robbins-Monro procedure is used on a sequence of bivariate
samples (Ref. 8). This is in essence a training sequence for the statistic, which can be
updated periodically in real-time systems such as picture transmission by sending a
preassigned test pattern. Thus the partial characterization of the bivariate distribution
via these k(m-1) quantiles is realistic.

B.  The  Scores 

Two classes of scores were investigated for use with the Bivariate m-Interval
Partition Test. The investigation of these scores is done in the one-dimensional case
with the m-Interval Partition Statistic. The first class is the locally most powerful
scores, which may be derived using the Neyman-Pearson Lemma and the small signal
assumptions. Specifically, these scores are given by $b_i = f(a_{i-1}) - f(a_i)$, where all these
quantities are defined in Section I. These scores provide the most powerful test of
hypothesis if the signal-to-noise ratio is small. Figure 3 contains a plot of the Type I
Error and power for the Bivariate Cauchy, Gaussian and Double-Exponential distribution
functions.
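
The locally most powerful scores can be evaluated directly from a density and its quantile function. The Python sketch below does this for a unit-scale double-exponential (Laplace) density, chosen here only as a worked example; the helper names are illustrative. The density is taken as zero at the two infinite end points.

```python
import math


def laplace_pdf(x):
    """Unit-scale double-exponential density."""
    return 0.5 * math.exp(-abs(x))


def laplace_quantile(u):
    """Inverse CDF of the unit-scale double-exponential distribution."""
    return math.log(2 * u) if u < 0.5 else -math.log(2 * (1 - u))


def lmp_scores(pdf, quantile, m):
    """Scores b_i = f(a_(i-1)) - f(a_i), i = 1..m, with f = 0 at +/- infinity."""
    f_at = [0.0] + [pdf(quantile(i / m)) for i in range(1, m)] + [0.0]
    return [f_at[i - 1] - f_at[i] for i in range(1, m + 1)]


if __name__ == "__main__":
    print(lmp_scores(laplace_pdf, laplace_quantile, m=4))
```

For this density all scores have the same magnitude and differ only in sign on either side of the median, so the resulting statistic behaves like the sign test, in line with the observation about the Double-Exponential Distribution made later in the text.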

The second class of scores is called the minimum variance scores. These are
derived by minimizing a functional related to the m-Interval Partition Statistic at or
near the hypothesis. To carry out this minimization one assumes that the scores
are given and minimizes the variance of the statistic subject to certain constraints.
This minimization is accomplished via the Calculus of Variations and yields the optimal
PDF. One then inverts this solution to obtain an optimal set of scores given this
PDF. The details are carried out in Appendix II of Reference 8. If f(x) is a unimodal,
continuous, and symmetric PDF, then the minimum variance scores are
given for arbitrary L by b(x) = (1 - f(x)/f(0))^ sgn(x) for |x| < L and b(x) = sgn(x) for |x| > L.
Figure 2 shows these scores for the Cauchy, Double-Exponential and Gaussian
distributions with L = 2. Plots of the corresponding Type I error and power for the same
distributions are given in Figure 4. The curves of Figs. 3 and 4 are simulated, and
their generation is discussed in Section V. In application, b(x) is evaluated at the
quantiles in order to determine the $b_i$.

Observe that the Type I Error or α-curve varies for different distributions, and
thus, by the definition given in Section I, the Bivariate m-Interval Partition Test is not
nonparametric in the sense of fixed Type I Error. However, the one-dimensional
m-Interval Partition Test is nonparametric (Ching and Kurz, 1972). This property
is lost in the bivariate extension due to the dependence of $T(\theta)$ on $\theta$. However, it is
the partitioning of the $\theta_i$-axes and the definition of the statistic as a set function on
these partitions that gives $T$ its robustness. This important property is retained.
Fig. 2  Minimum variance scores (curves: Double-Exponential, Cauchy, Gaussian)

This fact is reflected in the plot of the minimum variance scores contained in Fig. 2,
in that the scores appear relatively insensitive to changes in the distribution.

C.  Estimating  the  Threshold  and  Probability  of  Error  for  the  Bivariate  m-Interval 
Partition  Statistic 


Since the small sample distribution of the Bivariate m-Interval Partition
Statistic is unknown, one cannot obtain a set of theoretical curves of the Type I Error,
denoted by α, and the power, denoted by β, versus the threshold. However, by simulation,
the α and β curves may be obtained by the Robbins-Monro procedure using the
regression function $M(v) = P(\max_j T_j > v \mid H_0) - \alpha$, where $T_j$ is the m-Interval Partition
Statistic evaluated along the $\theta_j$-axis and $v$ is the threshold. The resulting recursion is

$$v_{n+1} = v_n + a_n \left[ I\!\left(\max_j T_j > v_n\right) - \alpha \right] \quad \text{under } H_0,$$

where $I(\cdot)$ is the indicator function and $a_n$ is a decreasing gain sequence. A similar
recursion equation holds under $H_1$ for the derivation of the β-curve.
Figure 3 gives the plots of the Type I error and power versus the threshold
for the bivariate Cauchy, Gaussian and Double-Exponential distributions using the
locally most powerful scores. Figure 4 contains similar plots for the minimum
variance scores. Table 1 contains a list of the Type II errors and total probability of
error assuming equally probable hypotheses for approximately equal Type I error.
This table was derived directly from Fig. 4 for a sample size of n = 10. Note particularly
the low signal-to-noise ratios when examining this table and Figures 3 and 4.
These results represent the performance of the Bivariate m-Interval Partition Classifier
in severe noise environments.

Fig. 3  Plot of Type I error and power vs. threshold
for the locally most powerful scores, N = 500 iterations

TABLE 1

Distribution          Type I error   Type II error   Total probability   Signal-to-noise
                                                     of error            ratio
Gaussian              0.11           0.3             0.205               1.0
Cauchy (scale 0.5)    0.10           0.22            0.160               undefined
Exponential           0.105          0.39            0.248               0.5
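
The Robbins-Monro search for a threshold with a prescribed Type I error can be sketched as follows. This is a generic stochastic-approximation illustration, not the simulation code used for the figures: `draw` is a hypothetical callable returning one realization of the maximum statistic under the hypothesis, and the gain c/(n + n0) with the constants shown is an assumed choice.

```python
import random


def robbins_monro_threshold(draw, alpha, n_iter=20000, c=10.0, n0=20, v0=0.0):
    """Find v with P(draw() > v) close to alpha by stochastic approximation.

    The update raises v after an exceedance and lowers it otherwise, so the
    long-run exceedance frequency settles at alpha.
    """
    v = v0
    for n in range(1, n_iter + 1):
        exceed = 1.0 if draw() > v else 0.0
        v += c / (n + n0) * (exceed - alpha)    # assumed gain sequence
    return v


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in statistic: a standard normal variable, whose upper 10% point
    # is about 1.28.
    print(robbins_monro_threshold(lambda: rng.gauss(0.0, 1.0), alpha=0.1))
```

Repeating the search over a range of alpha values traces out a simulated curve of Type I error against threshold; running it with draws generated under the alternative gives the corresponding power curve.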
Fig. 4  Plot of Type I error and power vs. threshold
for the minimum variance scores, N = 500 iterations

One can obtain a bound on the Type I Error by using a result proven by Andre
Tchen (Ref. 9). This result gives a bound on the probability that the maximum of a sequence
of dependent or independent random variables exceeds a threshold $v$. Let
$M_n = \max(X_1, \ldots, X_n)$; then

$$P(M_n > v) = P\!\left( \bigcup_{i=1}^{n} \{X_i > v\} \right) \le \sum_{i=1}^{n} P(X_i > v) = \sum_{i=1}^{n} \big(1 - F_i(v)\big)$$

where $F_i(x) = P(X_i \le x)$. The resulting bound is easily seen to be

$$P(M_n > v) \le \min\!\left(1, \sum_{i=1}^{n} \big(1 - F_i(v)\big)\right).$$

The important result that Tchen has shown is that there exists a dependent sequence of
random variables having the distributions $F_i$ and actually achieving the upper bound.
This bound can be applied to the $T_i$'s, using the asymptotic normality of the $T_i$'s.
Since the $T_i$'s are strongly dependent, this should provide a conservative bound. These
bounds have been included in Figures 3 and 4.
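
The bound is simple to evaluate once each marginal distribution is approximated by a normal distribution. The Python sketch below assumes that the means and standard deviations of the individual statistics are supplied by the caller; the function name and the example values are illustrative only.

```python
from statistics import NormalDist


def union_bound(v, means, sigmas):
    """min(1, sum_i (1 - F_i(v))) with each F_i taken as a normal CDF."""
    tail = sum(1.0 - NormalDist(mu, s).cdf(v) for mu, s in zip(means, sigmas))
    return min(1.0, tail)


if __name__ == "__main__":
    k = 24  # e.g. one statistic per direction for an increment of pi/12
    print(union_bound(3.0, [0.0] * k, [1.0] * k))
```

For low thresholds the sum of the tails exceeds one and the bound is uninformative; it becomes useful in the upper tail, where the individual exceedance probabilities are small.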

It is interesting to observe that for the Double-Exponential Distribution, the
m-Interval Partition Statistic is equivalent to the sign test when the locally most powerful
scores are calculated. As mentioned in Section II, the Bivariate m-Interval Partition
Statistic is not equivalent to the Bivariate Sign Test, since $\max(T_1, \ldots, T_k) \le \sup T(\theta)$,
$\theta \in [0, 2\pi]$. In fact, if the Type I Error for the Bivariate m-Interval Partition Test
via simulation is compared to the null distribution of the Bivariate Sign Test, the
latter exceeds the former. Thus the small sample distributions given by Hodges and
Klotz provide an upper bound for the performance of the Bivariate m-Interval Partition
Statistic for the Bivariate Double-Exponential Distribution.

D.  An Application of the Bivariate m-Interval Partition Statistic to Edge Detection

As an example of one of the many applications of this statistic, the problem of
edge detection in pattern recognition is considered. Assume that a two-dimensional
block pattern with two gray levels, i.e., black and white, is given. The Bivariate
m-Interval Partition Statistic is applied to detect and trace the edge of the block
pattern. The resulting edge can then possibly be used for feature extraction,
classification, or initiation of more sophisticated classification procedures. Other
approaches to the edge detection problem may be found in Rosenfeld.

The test pattern illustrated in this paper is a block pattern, i.e., a connected
solid pattern centered in a 54x60 array. The points in the solid block pattern have
mean 1 and those in the background have mean zero. It is assumed that at each point
of this 54x60 array, 10 random samples are available. That is, there are ten
corrupted copies of the test pattern stored. To apply the Bivariate m-Interval
Partition Test, one forms the bivariate random samples by pairing the samples of
adjacent points. Then one can use a sequence of bivariate tests of hypothesis to locate
and trace the block pattern edge. The null hypothesis in each of these tests is

$H_0$: $F(x_i, y_i)$, i.e., $(x, y)$ has mean (0, 0).

The alternative hypothesis $H_1$ is composite in that it contains any of the following
alternatives: $(x, y)$ has mean (1, 1), (1, 0), or (0, 1). Under both the alternative and
null hypothesis $F(x, y)$ is symmetric about its mean. For each adjacent pair of points,
one can either accept or reject the null hypothesis. For pairs of points in the
background, one expects to accept the hypothesis, and for pairs of points in the block
pattern, one expects to reject the hypothesis. The program used to simulate the edge
detection procedure is executed in two stages. The first stage locates an edge point
of the block pattern and the second stage traces the edge in a clockwise manner once
acquisition of an edge point has been achieved.
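
The pairing of adjacent points into bivariate samples can be sketched as follows. This is only an illustration of the data layout: the rectangle used as the block pattern, the unit-variance Gaussian noise, and the helper name `bivariate_sample` are assumptions made here, and the partition statistic itself is not implemented.

```python
import numpy as np

rng = np.random.default_rng(0)
ROWS, COLS, COPIES = 54, 60, 10

# Hypothetical block pattern: mean 1 inside a centered rectangle, mean 0 elsewhere.
pattern = np.zeros((ROWS, COLS))
pattern[17:37, 20:40] = 1.0

# Ten noise-corrupted copies (unit-variance Gaussian noise is an assumption).
copies = pattern + rng.normal(0.0, 1.0, size=(COPIES, ROWS, COLS))

def bivariate_sample(copies, row, col, direction):
    """Pair the samples at (row, col) with those of its right or lower neighbour."""
    dr, dc = (0, 1) if direction == "horizontal" else (1, 0)
    x = copies[:, row, col]
    y = copies[:, row + dr, col + dc]
    return np.column_stack([x, y])  # one (x, y) pair per stored copy

inside = bivariate_sample(copies, 25, 30, "horizontal")    # underlying mean (1, 1)
edge = bivariate_sample(copies, 25, 19, "horizontal")      # underlying mean (0, 1)
background = bivariate_sample(copies, 5, 5, "vertical")    # underlying mean (0, 0)
```

Each call returns the ten bivariate observations on which one test of the null hypothesis would be run; pairs that straddle the edge are the ones with mean (0, 1) or (1, 0).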

The first stage of the program is a pattern search consisting of a sequence of
alternating vertical and horizontal jumps which systematically searches the array for
the pattern. On horizontal jumps, the pair of points used to test for the presence of
the block pattern is taken in the horizontal direction. On vertical jumps, the
corresponding pair is taken in the vertical direction. Once the pattern is located, the
program backtracks along the path of the last jump, testing at each point for the
edge, i.e., the black-white interface.


The second stage of the processing is the edge tracing. In this stage, pairs of
points which are adjacent to the alleged edge point are tested in a prescribed
sequence to determine the next closest edge point. The program traces the edge of the
block pattern in a clockwise manner. In this stage of the program, bivariate samples
with mean (0, 1) or (1, 0) occur more frequently as the alternative hypothesis. For
these alternatives, an error is more likely to occur since the shift of mean is
smaller in magnitude than when the alternative is (1, 1). Accordingly, the $\alpha$ and
$\beta$ curves described in the previous section are based upon an alternative with a
shift of mean of the same magnitude, i.e., from (0, 0) to (0.707, 0.707). That is, the
threshold used in the simulation was established upon a worst case design criterion. In
addition, appropriate coding was inserted in the program to determine if the edge
tracing routine was "lost." When this occurred, a branch to a re-initialization program
was executed.

Fig. 5. Minimum variance scores.




Simulation runs for the Bivariate Normal, Bivariate Cauchy and Bivariate
Double-Exponential were made using both locally most powerful scores and minimum
variance scores. Figures 5 and 6 contain illustrations of these runs. In Figure 6, 10%
of the noise samples are Bivariate Normal with mean (0, 0) and
$\sigma_x^2 = \sigma_y^2$ = [value illegible in source], $\rho = 0$, and 90% of the
samples are Bivariate Normal with mean (0, 0) and $\sigma_x^2 = \sigma_y^2 = 1$,
$\rho = 0$. The latter distribution provides the noise samples for the simulation
illustrated in Fig. 5, with a signal-to-noise ratio of 1. Even at this low
signal-to-noise ratio, comparison of Figs. 5 and 6 illustrates the immunity of the
Bivariate m-Interval Partition Classifier to burst noise.

Fig. 6. Minimum variance scores.

Extension of this procedure to several gray levels would require adjusting the
threshold of the Bivariate m-Interval Partition Statistic and a different sequence
of hypothesis tests if the gray levels are represented by different magnitudes of
shift of mean.

Joint  Services  Technical  Advisory  Committee

A ROBUSTIZED VECTOR RECURSIVE STABILIZER ALGORITHM FOR IMAGE RESTORATION

I.  Kadar  and  L.  Kurz 

The iterative method of object reconstruction (or image restoration) is reformulated
in a new way to find parameters rather than functions. With a block of samples
taken from an iterative measurement equation of a time-varying signal in additive
noise, the block sample amplitudes become a vector of parameters to be estimated in
the presence of noise at each iteration step. This leads directly to the use of the
multidimensional extension of Gladyshev's Minimum Variance Least-Squares
Stochastic Approximation method (SAMVLS) and, subsequently, to the batch
preprocessed Mann-Whitney-Wilcoxon Nonparametric Statistic (MWWNS) robustized
vector SAMVLS.

The robustized SAMVLS algorithm provides immunity to measurement noise
outliers in unspecified contaminated noise environments. The method is also
computationally efficient and requires storing only the last block of batched
preprocessed data samples. This represents substantial savings in storage requirements
over the direct application of the unrobustized Robbins-Monro Stochastic Approximation
(RMSA) method, which requires storage of eigenfunctions and has no immunity to noise
outliers. The resultant new robust algorithm is equivalent to a natural stabilizer of
the measurement-noise-caused instabilities of the iterative method without the need for
stopping rule constraints on the recursion.

A.  The  Parametrized  Iterative  Object  Reconstruction  Algorithm 

Consider the object reconstruction problem from a truncated measured portion,
$g(t) = f(t)p_T(t)$, for which the iterative algorithm described in Ref. 1 becomes
(refer to Figure 1)

$$g_{n+1}(t) = [g_n(t) * k(t)]\,\bar{p}_T(t) + f(t)\,p_T(t), \qquad n = 1, 2, \ldots \qquad (1)$$

where

$$g_1(t) = f(t)\,p_T(t), \qquad \bar{p}_T(t) \equiv 1 - p_T(t),$$

and $k(t) = (\sin \sigma t)/\pi t$.

It is clear that

$$g_{n+1}(t) = \begin{cases} g(t), & |t| < T \\ g_n(t) * k(t), & |t| > T. \end{cases}$$

In this case the kernel $k(t)$ represents the band-limitedness of $f(t)$, i.e.,
$f_n(t) = g_{n-1}(t) * k(t)$.


Fig. 1. Plot of quantiles bivariate Cauchy independent components.

The known signal portion $g(t) = f(t)p_T(t)$ in Eq. (1) is replaced with

(2) 

where $V_n(t)$ is an additive noise term, i.i.d. for every step n of the iteration. At
this point no further assumptions need to be made about the statistics of the noise. At
each step of the iteration, starting with n = 1, a block of samples is taken of all
terms in Equation (1). If the functions under consideration are band-limited, then the
number of samples needed to represent the functions is defined. Otherwise, one has to
assess the significant frequencies of interest and select the sample size to make the
aliasing error vanishingly small. Some of these considerations are mentioned in
References 1 and 2. It will be assumed that even if the sampling rate is greater than
the Nyquist rate, the cross-correlation between the block of signal and noise samples
is small enough to be negligible and the samples are i.i.d.

Now let n = 1, 2, 3, ...; by taking block samples of each time function, one can
represent Eq. (1) in terms of "signal" $S_n$ and "noise" $W_n$ terms as

$$g_{n+1} = S_n\,\alpha + W_n$$

where, say, at some step n = k ≥ 1, $g_k$ and $W_k$ are m-vectors corresponding to m
samples per block, and $\alpha$ is a vector of parameters representing the amplitude of
the time-varying signal $S_k$, which is represented by a diagonal (m x m) matrix whose
m diagonal elements are the block samples of $S_k$,

$$S_n = \operatorname{diag}(S_{11}, S_{22}, \ldots, S_{mm}) \qquad (m \times m)$$

where the diagonal entries of $S_n$, $X_{1n} = S_{11}$, $X_{2n} = S_{22}$, ...,
$X_{mn} = S_{mm}$, are obtained from the block samples of

$$\Big\{\sum_{j} g_n(j)\,K(i-j)\Big\}\,\bar{p}_T + g .$$

The sum within the braces is the discrete convolution of, say, $r_g$ samples of $g_n(t)$
with $r_K$ samples of the kernel, such that the total number of samples is
$m = r_g + r_K - 1$; all of the vectors above are m-dimensional.

The  noise  term  is  given  by 


and  it  is  clear  from  Eqs.  (1)  and  (5)  by  the  discrete  convolution  operation  that  any 
noise  perturbation  is  spread  into  the  reconstructed  initially  unknown  signal  portion. 

It is important to note that the dimension m grows with each iteration step n, owing
to the convolution operation which is performed before the recursion for $\alpha$ is
iterated. This is not unexpected, since the algorithm is extrapolating the known signal
segment.
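
The convolution form of the iteration in Eq. (1) can be tried on a small noise-free example. The sketch below is a simplified illustration rather than the authors' program: it assumes $\sigma = \pi$, a fixed sampling grid (so the block length does not grow as described above), a test signal chosen here for convenience, and a convolution written as a Toeplitz matrix product.

```python
import numpy as np

# Assumed setup (not from the report): sigma = pi, grid step below the Nyquist interval.
dt = 0.25
t = np.arange(-40.0, 40.0 + dt, dt)
T = 1.0                                   # half-width of the measured window
inside = np.abs(t) <= T                   # p_T(t) sampled on the grid

f = np.sinc(t / 2.0) ** 2                 # test signal band-limited to |omega| <= pi
g = np.where(inside, f, 0.0)              # known portion g(t) = f(t) p_T(t)

# Discrete convolution with k(t) = sin(pi t)/(pi t), written as a Toeplitz matrix.
K = np.sinc(t[:, None] - t[None, :]) * dt

def iterate(g_known, inside, K, steps):
    """g_{n+1} = [g_n * k](1 - p_T) + f p_T, started from g_1 = f p_T."""
    g_n = g_known.copy()
    history = [g_n]
    for _ in range(steps):
        g_n = np.where(inside, g_known, K @ g_n)
        history.append(g_n)
    return history

history = iterate(g, inside, K, steps=20)
errors = [np.linalg.norm(f - g_n) for g_n in history]
print(errors[0], errors[-1])
```

Inside the window every iterate equals the measured samples by construction, and the list `errors` tracks how far each iterate is from the full signal on the grid.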

It should be noted at this point that the convolution form of the iterative
(restoration or extrapolation) algorithm, Eq. (1), is used rather than the FFT
implementation suggested by Papoulis and Youla. The convolution form can be adapted
directly to the stochastic approximation framework with its similarity to batch
preprocessing used in SAMVLS, albeit in this case the convolution is among the block
(vector) samples and does not require an initial delay. Furthermore, from a practical
point of view, charge coupled devices (CCD's) used as analog shift registers are being
developed in the 1977-78 time frame for real-time high speed convolution applications
for imaging sensor spaceborne signal processing (Ref. 4). The convolution operation is
performed with CCD's, either by direct storage of the samples and shifting operations,
or by using CCD's to implement the FFT algorithm and form the convolution in a
two-step operation. There is no published information available at this time on the
relative performance and complexity of the two methods. Both theoretical and
experimental work is in progress (Ref. 4).

B.  The  SAMVLS  Stabilizer  Algorithm 

The parametric form of the iterative restoration or extrapolation algorithm,
$g_{n+1} = S_n\alpha + W_n$, which can be viewed as a measurement equation where the
noise term $W_n$ can be contaminated by outliers, directly suggests the application of
the vector extension of Gladyshev's theorem to Minimum Variance Least-Squares (SAMVLS)
to estimate the time-varying signal parameter $\alpha$ at each step n.

To apply SAMVLS to the vector parameter estimation problem, one considers
each component of the measurement separately and forms, initially, an m-dimensional
SAMVLS (since the dimension m grows with each iteration step), which can be
considered as m scalar SAMVLS algorithms operating in parallel. Specifically, the
vector SAMVLS for Eq. (1) becomes


$$\alpha_{k+1} = \alpha_k - k^{-1} A_k(\cdot)\, Y(\alpha_k, \alpha) \qquad (6)$$


where, initially, $\alpha_1$ is an m-vector and $A_k(\cdot)$ is a diagonal (m x m)
adaptive gain matrix. Since $\alpha_1$ is arbitrary in the SAMVLS, the optimum choice in
this case is to let it equal the amplitude samples of the known signal g(t) for
|t| < T and zero for sample values |t| > T, making up an (m x 1) block sample vector.
Actually, the iterative measurement equation (algorithm within the SAMVLS algorithm) is
always one step behind, as the parameter vector is estimated at step k and the inner
algorithm is then updated. $Y(\alpha_k, \alpha)$, (m x 1), is obtained by correlating the
known signal at step k = n with the residual,

$$Y(\alpha_k, \alpha) = S_k^T\,[S_k\alpha_k - g_{k+1}] = M(\alpha_k, \alpha) + Z(\alpha_k),$$

where $\alpha_k$ is the current estimate of $\alpha$.

Substituting the measurement equation in the expression for $Y(\alpha_k, \alpha)$ gives

$$Y(\alpha_k, \alpha) = S_k^T\,[S_k(\alpha_k - \alpha) - W_k]$$

and the regression function is given by

$$E\,Y(\alpha_k, \alpha) = M(\alpha_k, \alpha) = S_k^T S_k\,(\alpha_k - \alpha),$$

which is linear and has a unique root 0 at $\alpha_k = \alpha$.

As observed previously in the proof of the scalar SAMVLS (see another item by
the authors elsewhere in the report), the regression function
$M(\alpha_k, \alpha) = E\{S_k^T[S_k\alpha_k - g_{k+1}]\}$ is actually a statement of the
multidimensional orthogonality principle, with $[g - S\alpha]$ being the "error term"
and $S_k$ the data. It is well known that the error is orthogonal to the data,
$E\{S^T[g - S\alpha]\} = 0$, and that $\alpha$ minimizes the mean-square error
$E\|g - S\alpha\|^2$. This can be interpreted geometrically in terms of orthogonal
projections, very much like the method of Youla. In reference to Fig. 2, consider
the Hilbert space setting of Youla; the orthogonality principle is represented by
(DE) ⊥ (OE), since (OD) is equivalent to [symbol illegible in source].

Referring at this point to the conditions required by Theorem 1 of Ref. 6, the
linear regression function satisfies conditions (iv) - (vi) with
$M(\alpha_k, \alpha) = B_k(\alpha_k - \alpha)$, where $B_k = S_k S_k^T$ is a positive
definite diagonal m x m matrix for each k, s.t. $\|B_k\| < \infty$, since one can
reasonably assume that the recursion is terminated after a finite number of steps.
(With both $B_k$ and $A_k$ diagonal for each k, P = I.) The additive noise term
$Z(\alpha_k) = -S_k^T W_k$, with $E\,Z_k(\alpha_k) = 0$, must have a uniformly bounded
variance and a well-defined covariance matrix as $k \to \infty$ a.s. This is reflected
in condition (vii, a), $\sup E\|Z(\alpha_k)\|^2 \le \sigma^2$ for some
$\sigma^2 < \infty$, which is satisfied with the Euclidean norm,
$\sup\{\operatorname{tr}[S_k^T E(WW^T) S_k]\} < \infty$ for each k, with $E(WW^T)$
nonnegative definite. Condition (vii, b),
$\lim E[Z(\alpha_k) Z^T(\alpha_k)] = \pi$, where $\pi$ is a nonnegative definite matrix
and where the limit is in the sense of the norm, is satisfied with
$\lim [S_k^T E(WW^T) S_k] = \pi$, $S_k$ being diagonal for each k and $E(WW^T)$ being
nonnegative definite and diagonal by the i.i.d. assumption of the problem.

The adaptive gain matrix $A_k(\alpha_1, \ldots, \alpha_k)$ needs to be well behaved and a
consistent "mean square" estimator of a constant matrix. With $A_k(\cdot)$ diagonal by
the assumption of the problem, the eigenvalues of $A_k$ are
$a_1^{(k)} \ge \cdots \ge a_m^{(k)} > 0$, and the eigenvalues of
$B_k = \operatorname{diag}(S_{11}^2, \ldots, S_{mm}^2)$, a diagonal matrix, are
$b_1^{(k)} \ge \cdots \ge b_m^{(k)} > 0$, with $a_1 \ge \cdots \ge a_m > 0$ the eigenvalues
of $A$. Condition (viii, a) is satisfied:

$$0 < a' \le \inf_k \Big[\sum_{i=1}^{m}\big(a_i^{(k)}\big)^2\Big]^{1/2} \le \sup_k \Big[\sum_{i=1}^{m}\big(a_i^{(k)}\big)^2\Big]^{1/2} \le a'' < \infty$$

wp1 for k large; and $A_k \to A$, where $A$ is a constant matrix s.t.
$a' \le [\sum_{i=1}^{m} a_i^2]^{1/2} \le a''$. Condition (ix) [inequalities illegible in
source], which is required for convergence within the proof, holds for each k.


Theorem 1: Under assumptions (i) - (ix) stated in Theorem 1 of Ref. 6, which
were shown to be satisfied above, let $r_1^{(k)} \ge r_2^{(k)} \ge \cdots \ge r_m^{(k)} > 0$
be eigenvalues of $AB$, $B = S_k S_k^T$. Then $k^{1/2}(\alpha_k - \alpha)$ is
asymptotically normal with mean zero and covariance matrix $Q$, where $Q$ is a diagonal
matrix whose elements are $a_{ii}^2\,\pi_{ii}/(2a_{ii}S_{ii}^2 - 1)$, where $\pi_{ii}$ are
the elements of $\pi = S^T E(WW^T) S$, a diagonal matrix, i = 1, 2, ..., m(k). For proof
see Reference 6.


Comment 


It is clear from the above Theorem and from the form of $Q$ that the asymptotic
variance is a function of the power in the signal samples (which are the eigenvalues of
$B$) and the covariance of i.i.d. noise vector samples, with the dimensionality
m = m(k) of $Q$ increasing with each iteration step. The optimum gain coefficient is
given by $a_{ii}^{(k)} = 1/S_{ii}^2$, which minimizes the components of the variance and
assures the rate of convergence to be optimal. One has to be careful, however, that
$a_{ii}^{(k)}$ does not approach $1/(2S_{ii}^2)$, since there the variance becomes
infinite and the recursion diverges. To avoid the instability with
$a_{ii}^{(k)} = 1/S_{ii}^2$, one could use the average power in the block signal samples,
which would rapidly become independent of k and would still guarantee near optimum
convergence.
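
The choice of gain discussed in the Comment can be checked directly. The short derivation below takes the diagonal entries of $Q$ to be $a^2\pi/(2aS^2 - 1)$ (the form assumed here, consistent with the covariance quoted for the robustized estimator in Section C), dropping the subscripts $ii$ and the superscript $(k)$ for brevity.

```latex
\[
  Q(a) = \frac{a^{2}\pi}{2aS^{2}-1}, \qquad a > \frac{1}{2S^{2}},
\]
\[
  \frac{dQ}{da}
  = \frac{2a\pi\,(2aS^{2}-1) - 2S^{2}a^{2}\pi}{(2aS^{2}-1)^{2}}
  = \frac{2a\pi\,(aS^{2}-1)}{(2aS^{2}-1)^{2}} = 0
  \;\Longrightarrow\; a = \frac{1}{S^{2}},
  \qquad Q\!\left(\frac{1}{S^{2}}\right) = \frac{\pi}{S^{4}} .
\]
```

The denominator vanishes at $a = 1/(2S^2)$, so the variance grows without bound as the gain falls toward that value, which is the instability the Comment warns about.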

However, even if one could find an estimator for $a_{ii} = 1/S_{ii}^2$ at every step by
somehow measuring the energy in the signal samples in the absence of noise, the
variance is a function of the covariance of the noise $E(W_k W_k^T)$, and the variance
of the recursion, both asymptotically and in the small sample case, is influenced by
the variations in $E(W_k W_k^T)$. One should recall at this point that the elements of
$S_k$ are

$$S_k = \Big\{\sum_{j=0} g_k(j)\,K(i-j)\Big\}\,\bar{p}_T + g$$

and the additive noise term is given by [expression illegible in source].

To alleviate this dependence, one needs to robustize the SAMVLS algorithm.
However, it should be noted that even in the above unrobustized case, the SAMVLS
algorithm reduces the mean squared error $E\|\alpha_k - \alpha\|^2 = O(1/k)$ due to the
noise in the data, while the iterative measurement equation (algorithm within the
SAMVLS algorithm) converges in mean-square as shown by Papoulis. Thus, the reduction of
the noise contribution at each iteration step eliminates the instability associated
with the iterative measurement algorithm, as long as the rate of convergence of the
SAMVLS compensates for the reduction of the eigenvalues of the prolate spheroidal
functions, which reduce at a rate depending on the time-bandwidth product, $T\sigma$,
with each iteration step.
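
The averaging effect of a 1/k gain can be illustrated on a toy one-parameter problem. The sketch below is not the SAMVLS algorithm of this section: it assumes a constant scalar signal level `S`, a measurement $y_k = S\alpha + w_k$, and a stochastic approximation step of the form $\alpha_{k+1} = \alpha_k - (a/k)\,S\,(S\alpha_k - y_k)$; all names and numbers are hypothetical.

```python
import numpy as np

def sa_recursion(y, S, gain, alpha0=0.0):
    """Scalar recursion alpha_{k+1} = alpha_k - (gain / k) * S * (S * alpha_k - y_k)."""
    alpha = alpha0
    path = []
    for k, y_k in enumerate(y, start=1):
        alpha = alpha - (gain / k) * S * (S * alpha - y_k)
        path.append(alpha)
    return np.array(path)

rng = np.random.default_rng(1)
S, alpha_true, n = 2.0, 0.7, 2000
y = S * alpha_true + rng.normal(0.0, 1.0, size=n)   # y_k = S * alpha + w_k

path = sa_recursion(y, S, gain=1.0 / S**2)           # gain a = 1 / S^2
running_mean = np.cumsum(y / S) / np.arange(1, n + 1)
print(path[-1], running_mean[-1])
```

With the gain set to $1/S^2$ the recursion reproduces the running sample mean of $y_k/S$, so in this toy setting the estimation error variance falls as 1/k, in line with the O(1/k) behaviour noted above.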

C.  Batch-Nonlinear-Integer Rank Transformation Robustized SAMVLS Image
Restoration Algorithm

To robustize Eq. (6), one introduces batch preprocessing and a nonparametric
rank statistic of the form (the B-N-L method discussed in another item in this report)


W^tRk.  £.)  =^2  Zj  ®[^-tq(k-l)]S-[i+qk]'t®^®-5J[i+q(k-l)]^ 

where $W^q(\cdot)$ is an (m(k) x 1) vector operator applied component by component.
$W^q(\cdot)$ is a symmetric version of the Mann-Whitney-Wilcoxon Nonparametric Statistic
(MWWNS) with properties summarized here for convenience:

$$E[W^q] = 0, \qquad \operatorname{Var} W^q(\cdot) = (2q+1)/3q^2, \qquad \sup_{F,G} W^q = -\inf_{F,G} W^q = 1 \qquad (7)$$

and under H and K

$$\lim_{N\to\infty} P\left\{\frac{W^q - E\,W^q}{\sqrt{\operatorname{Var} W^q}} < t\right\} = \Phi(t)$$

with asymptotic normality reached with as few as q = 8 samples. Furthermore, the
above properties do not require symmetry of the CDF, either under H or K.
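
The stated variance can be checked by simulation. The sketch below assumes that the symmetric statistic is the average of sign(u_i - v_j) over all pairs from two batches of q samples; this form is an assumption made here because it reproduces the bounds of ±1 and the variance (2q + 1)/3q² when both batches come from the same continuous distribution. The function name and the normal test data are hypothetical.

```python
import numpy as np

def symmetric_mww(u, v):
    """Average of sign(u_i - v_j) over all q*q pairs; the value lies in [-1, 1]."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return np.sign(u[..., :, None] - v[..., None, :]).mean(axis=(-2, -1))

q = 8
rng = np.random.default_rng(2)
u = rng.normal(size=(20000, q))      # 20000 independent batches of q samples
v = rng.normal(size=(20000, q))      # same distribution as u
w = symmetric_mww(u, v)

print(w.mean(), w.var(), (2 * q + 1) / (3 * q**2))
```

Because the statistic depends on the data only through the signs of differences, a single large outlier in a batch can change its value by at most 2/q, which is the source of the robustness.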


The robust vector SAMVLS in this case is of the form

$$\alpha_{k+1} = \alpha_k - k^{-1} A_k(\cdot)\, W^q[Y(\alpha_k, \alpha)]$$


which has to satisfy the conditions of Theorem 1. It should be noted here that the
batch preprocessing requires an initial delay of q samples during which the algorithm
is iterated and the samples are stored. This means that during this period the SAMVLS
operates essentially as an unrobustized algorithm and no protection is provided against
measurement noise outliers, causing possible instabilities in the "inner" measurement
algorithm. Proceeding in a manner similar to the scalar case in Ref. 7, it can be
shown that for each of the m(k) components, say i = 1, 2, ..., m(k), with
$M(\alpha_k, \alpha) = E\{W^q[Y(\alpha_k, \alpha)]\} = B(\alpha_k - \alpha)$, the diagonal
elements of the B-matrix (the slope of the regression function),
$B = \operatorname{diag}(b_{k,1}, \ldots, b_{k,m(k)})$, are in this case determined by
$f_{u_{ki}-v_{ki}}(0)$, i = 1, 2, ..., m(k), where $u_{ki}$ and $v_{ki}$ are the batched
components of the first and second terms in $Y(\alpha_k, \alpha)$, Eq. (7),
respectively, and $f_{u_{ki}-v_{ki}}(0)$ is a one-dimensional density corresponding to the
i-th component of $Y(\alpha_k, \alpha)$. It is clear from the previous definition of the
terms above that $f_{u_{ki}-v_{ki}}(0)$ is a function of the block signal samples, which
are time-varying from block to block. This means that the optimum gain coefficient,
$A_k = \operatorname{diag}(a_{1k}, a_{2k}, \ldots, a_{m(k)k})$, which depends on
$f_{u_{ki}-v_{ki}}(0)$, is also time-varying. By assuming that the block signal sample
amplitudes can be approximated by an averaged signal level between adjacent blocks,
$S_k = S = \operatorname{diag}(\operatorname{avg}|S_{11}|, \ldots, \operatorname{avg}|S_{mm}|)$,
$f_{u_{ki}-v_{ki}}(0)$ becomes a constant for each k. Now a simple approximation can be
derived by representing $f_{u_{ki}-v_{ki}}$, i = 1, 2, ..., m(k), as a generalized
Gaussian noise pdf, and for a wide class of both thin- and heavy-tailed pdf's its value
at zero varies only over a range of 2 to 1. This means that one does not need a very
precise estimator of $a_{ik}$ at each step k, since the efficiency of the SAMVLS is not
very sensitive to changes in $a_{ik}$. However, the estimator of $a_{ik}$ should be robust
if one desires high efficiency independent of the CDF of the measurement noise. Such an
estimator is given in Ref. 6, which for each component of $A_k(\cdot)$ is

k-1 

®ik  “ 4(k-l)^^  [(q/2)  + l]  '^j.[q/2]J 


where $Z_{j[(q/2)+1]}$ is the $[(q/2)+1]$th order statistic from the batched component
$u_{ki}$ of $Y(\alpha_k, \alpha)$, $Z_{j[q/2]}$ is the $[q/2]$th order statistic from
$u_{ki}$, i = 1, 2, ..., m(k), and $[e]$ is defined to be the greatest integer less than
or equal to e. It has been shown in Ref. 8 that the above robust adaptive estimator of
the optimum gain coefficient satisfies the conditions of Theorem 1 of Reference 6.

The covariance matrix of the asymptotically normal robustized estimator
$k^{1/2}(\alpha_k - \alpha)$ with mean zero is given by
$a_{ii}^2\,\pi_{ii}/[2a_{ii}S_{ii}^2 - 1]$, where $\pi_{ii} \equiv \sigma^2 [I]\, S^T S$,
i = 1, 2, ..., m(k), and $\sigma^2 = (2q+1)/3q^2$, which is independent of the
measurement noise statistics (compare this result with the covariance of the
unrobustized SAMVLS), and the $a_{ii}$ are given by the batched order statistic
estimator defined previously. In this case, the robustness is reached after a small
number of iterations and the instability of the recursion is only reached as a limit
point as $k \to \infty$.

D.  Simulation  Results 

The SAMVLS method was applied to a signal $f(t) = \sin(\sigma t)/\pi t$, choosing for T
the value $\pi/5\sigma$, in the presence of noise contamination described by the mixture
distribution model $f(v) = 0.9\,n(0, 1) + 0.1\,n(0, 8)$. The above example is the same as
the one used in Ref. 1, which is used here to illustrate the performance of the "inner"
algorithm, Eq. (1), in the absence of noise. This allowed comparing and checking the
results with those in Ref. 1 for the noise-free case using the convolution approach.
The result of the simulation of the noise-free "inner" algorithm by the convolution
approach is shown in Figure 3. The effect of noise contamination on the convolution
implementation of the "inner" algorithm is shown in the same figure, indicating
divergence. The stable performance of the SAMVLS algorithm in the presence of noise
contamination is also shown in Figure 3.

The above simulations were generated using single precision matrix routines.
This created a little computational noise in SAMVLS. Therefore, in Fig. 3 a best
mean squared fit is shown. If double precision matrix routines are used, the
computational noise for all practical purposes disappears.

The robustized version of SAMVLS was not used because of the unavailability of an
appropriate computer facility and supporting software (advanced IBM 370 systems).
The simulations were performed on the IBM 360/67 Time Sharing computer with
CPU-time constraints and slow computational speed.

Fig. 3. Plot of Type I error and power vs. threshold for the locally most powerful
scores, N = 500 iterations.

ESTIMATION OF PROBABILITY DENSITY FUNCTIONS VIA STOCHASTIC APPROXIMATION

P.  Kersten  and  L.  Kurz 

Tsypkin (Ref. 1) considers the use of the Robbins-Monro (R-M) stochastic approximation
procedure to estimate an unknown probability density function. To accomplish this, he
approximates the density function $f(x)$ by a finite linear combination of orthonormal
functions, i.e.,

$$f(x) = \sum_{i=1}^{N} C_i\,\phi_i(x)$$

where

$$\int \phi_i(x)\,\phi_j(x)\,dx = \delta_{ij} = \langle \phi_i, \phi_j \rangle .$$

The performance measure he minimizes is the integral square error (ISE)

$$I = \int \Big[f(x) - \sum_{i=1}^{N} C_i\,\phi_i(x)\Big]^2 dx$$

with respect to $C_i$, i = 1, 2, ..., N. Kashyap and Blaydon (Ref. 2) extended this to
general nonorthogonal basis functions. In this report, a set of basis functions is
introduced to insure ease of implementation and robustness of the algorithm. It was
pointed out previously by the authors (Ref. 3) that the estimation of the quantiles was
an essential ingredient for the implementation of the m-interval detector. Ching and
Kurz (Ref. 4) also remark that the locally most powerful scores for this detector
require knowledge of the p.d.f. at the quantiles. They provide one procedure for
obtaining these values at the same time that the quantiles are being estimated.

A.  Choice  of  the  Basis  Functions 


Though one can choose basis functions with infinite support, this would require
evaluation of each of the basis functions at each of the sample points $W_n$. An
alternative which is computationally more attractive is to use a set of nonorthogonal
continuous basis functions with finite support. As an example, consider the following
functions, which are merely the square root of a translated Beta p.d.f.:

$$\phi(w) = \begin{cases} \big[(2\alpha+1)!/\alpha!\,\alpha!\big]^{1/2}\,(w + \tfrac{1}{2})^{\alpha/2}\,(\tfrac{1}{2} - w)^{\alpha/2}, & w \in (-\tfrac{1}{2}, \tfrac{1}{2}),\ \alpha \text{ even} \\ 0, & \text{otherwise.} \end{cases}$$

Here one may construct the basis functions from translations of a single function
$\phi(\cdot)$ and vary their interaction merely by choosing the size of the translation
appropriately. In general, suppose $\phi_i(x)$ overlaps $\phi_j(x)$ for
$j \in \{i-k, \ldots, i, \ldots, i+k\}$. If one now minimizes the ISE by taking partial
derivatives with respect to $C_i$, one obtains N coupled linear equations

$$E[\phi_i(w)] = \sum_{j=i-k}^{i+k} C_j\,\langle \phi_i, \phi_j \rangle, \qquad i = 1, \ldots, N, \qquad C_j = 0 \text{ if } j < 1 \text{ or } j > N .$$

In matrix form

$$E[\Phi(x)] = DC$$

where $D$ is a band matrix with 2k + 1 nonzero diagonals given by
$d_{ij} = \langle \phi_i, \phi_j \rangle$ and $C$ is the vector of $C_i$. The resulting
vector recursion equation can be written as

$$X_{n+1} = X_n - a_n\,[X_n - D^{-1}\Phi(W_n)]$$

or

$$X_{n+1} = X_n - a_n\,[D X_n - \Phi(W_n)]$$

where $W_n$ are i.i.d. random samples from the distribution. Since, by our choice of
basis functions that we specify later, $D^{-1}$ exists, the first version of the
recursion relationship is chosen so that the random gain matrix reduces to the identity
matrix (see Reference 3). In the notation of Theorem 1 of Ref. 3, A = B = I = P.
Moreover, $Y(X_n; C) = X_n - D^{-1}\Phi(W_n)$, $M(X, C) = X - C$, and $Z$ accordingly is
$C - D^{-1}\Phi(W_n)$.

Note that for any sample point only 2k+1 of the N basis functions are nonzero and
need to be evaluated. Moreover, $\partial [M(X, C)]_i / \partial X_j = \delta_{ij}$, so
that the gain matrix is "matched" for the optimal rate of convergence of each component
of the vector R-M procedure.

B.  Results  of  Simulations 

Figures 1 and 2 contain the results of two simulations of the estimation procedure
suggested in Section A. Here N = 13, $\alpha$ = 6, k = 2, and

$$\phi_i(w) = \begin{cases} (1.2012)^{1/2} \times 10^{2}\,(w - i/3 + 17/6)^3\,(i/3 - 11/6 - w)^3, & i/3 - 17/6 < w < i/3 - 11/6 \\ 0, & \text{otherwise.} \end{cases}$$

Simulations were run for both the normal density with mean zero and variance 1/2 and
the Laplace density $f(x) = \exp(-|x|/\lambda)/2\lambda$, $\lambda = 1/\pi$. The method
requires no storage of samples and is particularly suitable for high sample rates and
real-time computation. The choice of the Beta density functions provides great
flexibility in choosing the shape of the expansion function to fit the density function
should any a priori knowledge of its shape be available. Since the support of
$\max_i \phi_i(x)$ is finite, this procedure can be adapted to censor burst noise
assuming values ±B with probability 1/2, with ±B both outside the support of
$\max_i \phi_i(x)$.

Fig. 2. Probability density function estimation via the vector Robbins-Monro procedure:
double exponential distribution, sample size 1500, $f(x) = \exp(-|x|/\lambda)/2\lambda$.
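
The estimation procedure can be sketched in a few lines. The code below is an illustrative reading of the recursion $X_{n+1} = X_n - a_n[X_n - D^{-1}\Phi(W_n)]$ with N = 13 and $\alpha$ = 6 as quoted above; the gain sequence $a_n = 1/n$, the Riemann-sum evaluation of $D$, the sample size of 5000, and all function names are assumptions made here and are not taken from the report.

```python
import numpy as np

N, ALPHA = 13, 6                                   # values quoted in the text
HALF = ALPHA // 2
NORM = np.sqrt(12012.0)                            # [(2a+1)!/(a! a!)]^(1/2) for a = 6
LOWER = np.arange(1, N + 1) / 3.0 - 17.0 / 6.0     # left end of each unit-length support

def basis(w):
    """Evaluate all N basis functions at the points w; returns shape (len(w), N)."""
    x = np.atleast_1d(w)[:, None] - LOWER[None, :]   # position inside each support
    on = (x > 0.0) & (x < 1.0)
    xc = np.clip(x, 0.0, 1.0)
    return np.where(on, NORM * xc**HALF * (1.0 - xc) ** HALF, 0.0)

# Gram matrix D_ij = <phi_i, phi_j>, approximated by a Riemann sum on a fine grid.
grid = np.linspace(-3.0, 3.0, 60001)
dx = grid[1] - grid[0]
Phi = basis(grid)
D = Phi.T @ Phi * dx
D_inv = np.linalg.inv(D)

def estimate(samples):
    """X_{n+1} = X_n - (1/n) [X_n - D^{-1} Phi(W_n)], started from zero."""
    X = np.zeros(N)
    for n, w in enumerate(samples, start=1):
        X = X - (X - D_inv @ basis(w)[0]) / n
    return X

rng = np.random.default_rng(3)
samples = rng.normal(0.0, np.sqrt(0.5), size=5000)   # normal density, variance 1/2
C_hat = estimate(samples)
f_hat_at_zero = (basis(0.0) @ C_hat)[0]
print(f_hat_at_zero, 1.0 / np.sqrt(np.pi))
```

Each sample is used once and then discarded, which reflects the remark above that the method requires no storage of samples; with the 1/n gain the coefficient vector is simply the running average of $D^{-1}\Phi(W_n)$.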

FIXED POINTS OF RUNNING MEDIANS AND LOCALLY MONOTONIC REGRESSION
S-G.  Tyan 

A.  Introduction 

The use of running medians and their combinations as nonlinear smoothers was
introduced by Tukey (Refs. 1 and 2). At first they were recognized only as good
resistant/robust smoothers against spiky noise or wildly erratic data. Then Rabiner,
et al. discovered that these smoothers could also preserve and follow the jump-type
discontinuities in the signal sequences, and used them extensively in speech processing.
These two properties have made running medians very desirable in areas such as
digital image processing, exploratory data analysis and speech processing.

Median-based smoothers are promising in engineering applications because they serve
two purposes at the same time: suppressing the spiky noise on the one hand and
preserving the jump-type discontinuities in the signal on the other.


In this report running medians are treated as general nonlinear operators and
their fixed points are studied. The fixed points are then used as indicators of the
effect of running medians on general data sequences. Finally, a few monotonic
regression examples are given to justify the use of compound smoothers. All the
proofs of the theorems given in this report are omitted; they can be found in
a paper by the author.

B.  Fixed  Points  of  Running  Medians 

For an infinite sequence $x_n$, $-\infty < n < \infty$, its running median of odd length 2k+1
is defined as the sequence $y_n$, $-\infty < n < \infty$, where $y_n$ is the median of $x_i$,
$n-k \le i \le n+k$. For a finite sequence, several definitions have been used for the
end points. However, for convenience, we shall use the following:

Definition: Let $x_n$, $1 \le n \le N$, be a finite sequence. Then its running median of
length 2k+1 is defined as

\[
y_n = \text{median of } x_i, \quad
\begin{cases}
1 \le i \le 2n-1, & \text{for } 1 \le n \le k, \\
n-k \le i \le n+k, & \text{for } k+1 \le n \le N-k, \\
2n-N \le i \le N, & \text{for } N-k+1 \le n \le N.
\end{cases}
\]
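The finite-sequence definition above amounts to shrinking the window symmetrically near either end. A minimal Python sketch of that rule follows; the function name `running_median` and the 0-based indexing are choices made here for illustration, not notation from the report.

```python
from statistics import median

def running_median(x, k):
    """RM(2k+1) of a finite list x, using the symmetric shrinking window at the ends."""
    last = len(x) - 1
    out = []
    for j in range(len(x)):          # j is 0-based; the text's index is n = j + 1
        # Largest half-width (at most k) that keeps the window inside the sequence.
        h = min(k, j, last - j)
        out.append(median(x[j - h : j + h + 1]))
    return out

print(running_median([0, 5, 1, 1, 9, 2, 2], 1))
print(running_median([0, 5, 1, 1, 9, 2, 2], 2))
```

The window always has odd length, so the median is one of the samples and no averaging of middle values occurs.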


For a single-side infinite sequence the definition of the running median is clear. From
now on we shall use RM(2k+1) to denote the running median of length 2k+1 and use
RM(2k+1)$\{x_n\}$ for that of a sequence $\{x_n\}$. The following statements, unless
otherwise specified, hold for both finite and infinite sequences.

Lemma 1: The running medians are non-expansive mappings, i.e., for any two
sequences $\{x_n\}$ and $\{y_n\}$ we have

\[
\sup_n \left| \mathrm{RM}(2k+1)\{x_n\} - \mathrm{RM}(2k+1)\{y_n\} \right| \le \sup_n |x_n - y_n| .
\]

Here we have used the supremum of the absolute difference as a measure of distance.

Even though what we actually want to understand is the effect that running
medians have on an arbitrary sequence, it seems worthwhile to study those sequences
which are unaffected by a running median, namely, its fixed points. This is because
of the inherent difficulty with nonlinear operators. On the other hand, fixed points
often serve as limits when the operator is used repeatedly and therefore can be used to
indicate the trend.

First it is easy to show that a monotonic sequence is invariant under RM(2k+1)
for all k. In fact, it can be shown that the requirement of "monotonicity" can be
much weakened.

Definition: A sequence $\{x_n\}$ is locally monotonic of length m (abbr. LOMO(m))
iff $\{x_n, x_{n+1}, \ldots, x_{n+m-1}\}$ is monotonic for each n.

Obviously, a LOMO(m) sequence is also LOMO(p) provided $p \le m$. Again, assume
that $\{x_n\}$ is LOMO(m). If the segment $\{x_n, \ldots, x_{n+m-1}\}$ is increasing, i.e.,
$x_n < x_{n+m-1}$, and the segment $\{x_{n+1}, \ldots, x_{n+m}\}$ is decreasing, i.e.,
$x_{n+1} > x_{n+m}$, then it is apparent that $x_{n+1} = \cdots = x_{n+m-1} > x_{n+m}$.
Hence we have the following lemma:

Lemma 2: If there is any change in trend, then a LOMO(m) sequence must stay
constant for at least m-1 samples.

We shall show in Theorem 1 that local monotonicity is enough to guarantee
invariance under running medians.

Theorem 1: If a sequence $\{x_n\}$ is LOMO(m) then it is invariant under RM(2k+1) for
all k, $k \le m-2$.

For example, a LOMO(4) sequence is invariant under RM(5). Even though LOMO(k+2)
sequences are invariant under RM(2k+1), they usually do not exhaust the set of fixed
points of the RM(2k+1) smoother. It has been found that these fixed points belong to
two basically different types. Type I consists of the LOMO(k+2) sequences which have
been mentioned.
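Theorem 1 can be checked numerically on a small finite example. The sketch below is illustrative only: the helper names `is_lomo` and `running_median` and the test sequence are assumptions introduced here, and the end-point rule is the shrinking-window definition of Section B.

```python
from statistics import median

def running_median(x, k):
    """RM(2k+1) with the symmetric shrinking window at the two ends."""
    last = len(x) - 1
    return [median(x[j - min(k, j, last - j) : j + min(k, j, last - j) + 1])
            for j in range(len(x))]

def is_monotonic(seg):
    pairs = list(zip(seg, seg[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)

def is_lomo(x, m):
    """True if every segment of m consecutive samples of the finite list x is monotonic."""
    if len(x) <= m:
        return is_monotonic(x)
    return all(is_monotonic(x[i:i + m]) for i in range(len(x) - m + 1))

# Flats of length 3 at each change of trend: LOMO(4) but not LOMO(5) (compare Lemma 2).
x = [0, 0, 0, 1, 1, 1, 3, 3, 3, 2, 2, 2]
print(is_lomo(x, 4), is_lomo(x, 5))
print(running_median(x, 1) == x, running_median(x, 2) == x, running_median(x, 3) == x)
```

With m = 4 the theorem covers k = 1 and k = 2; the RM(7) result shows that invariance can fail once k exceeds m-2.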


Theorem 2: If $\{x_n\}$ is a fixed point of RM(2k+1), then $\{x_n\}$ is LOMO(k+2) provided
there exists a monotonic segment $\{x_p, \ldots, x_{p+k}\}$ of length k+1.

The above theorem says that if a fixed point of RM(2k+1) is smooth enough (i.e.,
monotonic) for a segment of length k+1, then it is smooth over the whole length (i.e.,
LOMO(k+2)).

Definition: A sequence is nowhere LOMO(k) if it does not contain any monotonic
segment of length k.

Theorem 3: Let $\{x_n\}$ be a double-side infinite sequence and let it be a fixed point of
RM(2k+1). If $\{x_n\}$ is nowhere LOMO(k+1), then $\{x_n\}$ is a duo-valued sequence, i.e.,
$x_n$ can take on only two values. It is a Type II fixed point.

Example. We will always use 0 and 1 to indicate the two values. For RM(5), the
only Type II double-side infinite sequence is ..., 0, 1, 0, 1, .... For RM(7), we have
also only one double-side infinite Type II sequence, which is
..., 0,1,1,0,1,0,0,1,0,1,1,0,1,1,0,0,1, .... Both are periodic. We have some algorithms
to generate these Type II sequences for RM(2k+1); however, they do not exhaust the
case for k > 4. It is not even known whether they are all periodic or not. Since Type II
sequences tend to fluctuate faster than Type I sequences of RM(2k+1), they should be
considered rather undesirable in a data smoothing problem. If this is true, then they
should be suppressed by all means. For the same reason, the algorithms used to
generate them are not presented here.
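A periodic double-side infinite sequence can be examined by applying the running median circularly to one period. The sketch below does this for the alternating sequence. The period-8 pattern 0,1,1,0,1,0,0,1 is only an assumption made here, taken from the first eight printed terms of the RM(7) example; the report does not state the period.

```python
from statistics import median

def circular_running_median(period, k):
    """RM(2k+1) applied to the doubly infinite periodic extension of `period`."""
    p = len(period)
    return [median(period[(j + d) % p] for d in range(-k, k + 1)) for j in range(p)]

alternating = [0, 1]
print(circular_running_median(alternating, 2))   # RM(5)
print(circular_running_median(alternating, 3))   # RM(7)

# Hypothetical period-8 reading of the RM(7) example (an assumption, see above).
candidate = [0, 1, 1, 0, 1, 0, 0, 1]
print(circular_running_median(candidate, 3) == candidate)
```

Under these assumptions the alternating sequence is left unchanged by RM(5) but has its two values interchanged by RM(7), while the period-8 candidate is unchanged by RM(7).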

Theorem 3 does not hold for single-side infinite sequences or finite sequences
because of the definition of running medians at the boundaries. However, only a slight
modification is necessary.

Theorem 4: Let $x_n$, $0 \le n \le N$, be a fixed point of RM(2k+1), where $N \ge 2k$. If
$\{x_n\}$ is nowhere LOMO(k+1), then it is a duo-valued sequence, say, taking values from
{0,1}, except possibly at the two ends, where $\{x_n\}$ moves away from {0,1}
monotonically as n approaches the two ends 0 and N.

For example, the sequence $\{x_0, x_1, 0, 0, 0, 1, 0, 0, 1, 1, 1, x_{11}\}$, where $x_0 > x_1 > 1$
and $x_{11} < 0$, is a fixed point of RM(11). Note that $x_n$ is either 0 or 1 in the central
portion of the sequence and that $x_n$ moves away monotonically from {0,1} as
$n \to 0$ or $n \to N = 11$; consequently we always have $x_n (x_n - 1) \ge 0$. Since running
medians are order statistics, it is obvious that the following lemma is true.

Lemma 3: If $\{x_n\}$ is a fixed point of RM(2k+1) and if the mapping $g(\cdot)$ is monotonic,
then $\{g(x_n)\}$ is also a fixed point of RM(2k+1).

With the above Lemma it is easily seen that, by setting g(x) = 1 for $x \ge 1$ and
g(x) = 0 for x < 1, the sequence $\{g(x_n)\}$ is a duo-valued Type II fixed point if $\{x_n\}$ is
a Type II finite sequence. On the other hand, we have the following:

Corollary 1: Let $\{x_n\}$, $0 \le n \le N$, be a Type II fixed point of RM(2k+1) taking values
from {0,1}. Then $\{y_n\}$, $0 \le n \le N$, is also a Type II fixed point of RM(2k+1), where
$\{y_n\}$ is constructed according to the following rules:

If $x_i = 1$ (0) for $0 \le i \le n_1$, then $y_i \ge 1$ ($\le 0$) and are decreasing (increasing) for
$0 \le i \le n_1$.

If $x_i = 1$ (0) for $n_2 \le i \le N$, then $y_i \ge 1$ ($\le 0$) and are increasing (decreasing) for
$n_2 \le i \le N$.

By Corollary 1, it is clear that all the Type II fixed points of finite length are essentially
duo-valued sequences except with slight flexibility at the two ends. They can be
constructed from duo-valued sequences immediately by using the method given in the
Corollary.
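The finite RM(11) example and the thresholding argument of Lemma 3 can be reproduced in a few lines. In the sketch below the end values 3, 2 and -1 are arbitrary illustrative choices satisfying $x_0 > x_1 > 1$ and $x_{11} < 0$, and the helper names are assumptions introduced here.

```python
from statistics import median

def running_median(x, k):
    """RM(2k+1) with the symmetric shrinking window at the two ends."""
    last = len(x) - 1
    return [median(x[j - min(k, j, last - j) : j + min(k, j, last - j) + 1])
            for j in range(len(x))]

def has_monotonic_segment(x, m):
    """True if some segment of m consecutive samples is monotonic."""
    def mono(seg):
        pairs = list(zip(seg, seg[1:]))
        return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)
    return any(mono(x[i:i + m]) for i in range(len(x) - m + 1))

x = [3, 2, 0, 0, 0, 1, 0, 0, 1, 1, 1, -1]      # x_0 = 3, x_1 = 2, x_11 = -1 (illustrative)
print(running_median(x, 5) == x)               # fixed point of RM(11)
print(has_monotonic_segment(x, 6))             # nowhere LOMO(6) when this is False

g = [1 if v >= 1 else 0 for v in x]            # the monotonic map g of Lemma 3
print(g, running_median(g, 5) == g)
```

Reading the last two lines in reverse gives the construction of Corollary 1: starting from the duo-valued sequence `g`, the end samples may be pushed monotonically away from {0,1} without losing the fixed-point property.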

In case $\{x_n\}$ is a single-side infinite sequence, Theorem 4 and its corollary
still hold with "two ends" replaced by "one end."

If the Type II sequences are set aside, then we have the converse to
Theorem 1.

Theorem 5: (The converse to Theorem 1.) If $\{x_n\}$ is invariant under RM(2p+1) for all
p = 1, 2, ..., k, then $\{x_n\}$ is LOMO(k+2).

Proof: By Theorem 2 and the fact that any sequence is LOMO(2), a fixed point of
RM(3), i.e., k = 1, must be LOMO(3). Suppose the theorem holds for k-1. Then $\{x_n\}$,
which is invariant under RM(2p+1) for all p = 1, 2, ..., k, must be LOMO(k+1);
therefore, each segment of length k+1 is monotonic. By Theorem 2, $\{x_n\}$ is LOMO(k+2).

Remark: If a sequence $\{x_n\}$, $0 \le n \le b$, where b may be infinite, is invariant under
RM(5), then $x_1 = \text{median}\{x_0, x_1, x_2\}$, which implies that $\{x_0, x_1, x_2\}$ is monotonic.
By Theorem 2 the sequence must be LOMO(4). Therefore Type II sequences do not
exist for this case.

Rabiner, et al. observed, for a special case, the possible relation between k
and m in Theorem 1. It has recently come to our notice that Velleman (Ref. 8) also
described correctly the relation between k and m. He also made the observation that
RM(2k+1) tends to create flat tops (or bottoms) of length k+1, which is the very
characteristic of LOMO(k+2) sequences. (See Lemma 2.)

Locally monotonic sequences have a certain kind of smoothness in terms of
monotonicity; a LOMO(m) sequence does not allow any change in trend within a segment
of m consecutive samples. This excludes the possibility of isolated impulses
or a burst of them with duration less than or equal to m-2. On the other hand,
jump-type discontinuities are allowed without regard to the magnitude of the jump. Of
course, not all the properties of LOMO sequences are desired. For example, the flats
which are necessary for any change in trend can be an eyesore to many viewers
(Refs. 2 and 8). In contrast, the Type II sequences seem to be totally unwanted in a
data smoothing problem. Therefore, a good smoother built upon running medians must
not have fixed points of this type. In fact, for a smoother T, we should consider not
only all the fixed points of T but also those of $T^n$ for all natural numbers n. This is
because a sequence, even though not a fixed point of T, may be a recurrent sequence,
which is just as bad as the Type II sequences. A typical example is the alternating
sequence ..., 0, 1, 0, 1, ..., which is also the only Type II sequence of RM(5). The
sequence is not a fixed point of RM(3); nevertheless, it is a fixed point of RM(3,3).
Here we use RM(p,q,r) to denote the operation of successively applying RM(p), RM(q)
and RM(r) to an arbitrary sequence in this order. We shall also use RM(p*$\ell$) to
indicate repeating RM(p) $\ell$ times.

Since RM(3) and RM(5) are the most frequently used running medians, we shall
concentrate on problems related to them exclusively. From Theorem 4 and its
Corollary one can see that, even though we have used possibly the simplest definition
of the running medians at the boundaries of the finite sequence, the boundary still
poses unnecessary complications without yielding much additional information.
Therefore in the next section only double-side infinite sequences will be considered.


C. Some Compound Smoothers Related to RM(3)

A class of nonlinear smoothers which also includes RM(3) can be characterized
via the following properties. We let $\{z_n\} = T\{x_n\}$, where T is the operator under
consideration, and let $y_n = \text{median}\{x_{n-1}, x_n, x_{n+1}\}$.

(i) $(z_n - x_n)(z_n - y_n) \le 0$ (i.e., $z_n$ is between $x_n$ and $y_n$).

(ii) $y_n \le z_n < x_n$ or $y_n \ge z_n > x_n$ if $x_n \ne y_n$.

A stronger condition is

(iii) $(z_n - x_n)(z_n - y_n) < 0$ if $x_n \ne y_n$.

Example: RM(3) satisfies (i) and (ii) but not (iii). However, 1/2 [I + RM(3)], where
I is the identity, satisfies (i) and (iii).
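Properties (i) to (iii) are pointwise conditions on the triple $(x_n, y_n, z_n)$, so they are easy to test on sample data. The following sketch checks them at the interior samples of a finite list; the function names and the test sequence are illustrative assumptions, and the end samples are simply skipped.

```python
from statistics import median

def local_median(x, j):
    return median((x[j - 1], x[j], x[j + 1]))

def check_properties(x, z):
    """Report whether (i), (ii), (iii) hold at every interior sample for z = T{x}."""
    p1 = p2 = p3 = True
    for j in range(1, len(x) - 1):
        xn, yn, zn = x[j], local_median(x, j), z[j]
        p1 = p1 and (zn - xn) * (zn - yn) <= 0
        if xn != yn:
            p2 = p2 and ((yn <= zn < xn) or (yn >= zn > xn))
            p3 = p3 and (zn - xn) * (zn - yn) < 0
    return p1, p2, p3

x = [0, 4, 1, 3, 3, 0, 2]
rm3 = [x[0]] + [local_median(x, j) for j in range(1, len(x) - 1)] + [x[-1]]
half = [(a + b) / 2 for a, b in zip(x, rm3)]      # 1/2 [I + RM(3)]

print(check_properties(x, rm3))    # RM(3)
print(check_properties(x, half))   # 1/2 [I + RM(3)]
print(check_properties(x, x))      # the identity I, for comparison
```

On this data RM(3) passes (i) and (ii) only, the averaged smoother passes all three, and the identity passes (i) alone, which matches the Example.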

Lemma 4: Suppose T satisfies (i). If $\{x_{n-1}, x_n, x_{n+1}\}$ is monotonic, then $z_n = x_n$
and $\{z_{n-1}, z_n, z_{n+1}\}$ is monotonic of the same trend.

Obviously, LOMO(3) sequences are fixed points of any T which satisfies
Property (i).

Lemma 5: Let $0 < \alpha < 1$ and let $T_1 * T_2$ be the compound smoother obtained by
applying $T_1$ and $T_2$ in succession.

(a) If $T_1$ and $T_2$ satisfy (i), then $\alpha T_1 + (1-\alpha) T_2$ and $T_1 * T_2$ both satisfy (i).

(b) If $T_1$ satisfies (i) and (ii) and if $T_2$ satisfies (i), then $\alpha T_1 + (1-\alpha) T_2$
satisfies (ii).

(c) If $T_1$ satisfies (i) and (iii) and if $T_2$ satisfies (i), then $\alpha T_1 + (1-\alpha) T_2$
satisfies (iii) and $T_1 * T_2$ satisfies (ii). Furthermore, if $T_2$ also satisfies
(iii), then $T_1 * T_2$ satisfies (iii).

Remark: Even if both $T_1$ and $T_2$ satisfy (i) and (ii), $T_1 * T_2$ may not satisfy (ii).

Theorem 6: If T satisfies (i) and (ii) and if $\{x_n\}$ is a fixed point of T, then $\{x_n\}$
is a LOMO(3) sequence.

Theorem 7: Let $T = T_1 * T_2$. Suppose $T_1$ satisfies (i) and (ii) and $T_2$ satisfies (i).
If $\{x_n\}$ is a fixed point of T then $\{x_n\}$ is either a LOMO(3) sequence or an alternating
sequence (i.e., ..., 0, 1, 0, 1, ...). If $T_1$ satisfies (i) and (iii) instead, then $\{x_n\}$ is
a fixed point iff it is LOMO(3). The above results hold for $T = T_2 * T_1$.

Remark: Since RM(3) satisfies (i) and (ii), by Theorem 6, it has LOMO(3) sequences
as its only fixed points. However, a direct application of Theorem 7 shows that
RM(3*$\ell$) has both LOMO(3) and alternating sequences as its fixed points if $\ell$ is even.
For $\ell$ odd, it has only LOMO(3) fixed points. Consequently, the alternating sequence
is the only recurrent sequence of RM(3).

Since an alternating sequence is usually considered rather rough, it is
unwise to use RM(3) or RM(3*$\ell$) as the only smoother in smoothing. In the following
we consider another kind of arrangement which can forestall this unwanted sequence.

Lemma 6: Let $0 < \alpha < 1$ and let $T_1$ and $T_2$ satisfy (i). Then $\{x_n\}$ is a fixed point
of $T = \alpha T_1 + (1-\alpha) T_2$ iff it is a fixed point of $T_1$ and $T_2$ simultaneously.

Combining Lemma 6 and the remark following Theorem 7 we have

Theorem 8: Let $K \ge 1$, $a_k \ge 0$, and $\sum_{k=0}^{K} a_k = 1$. Then the smoother

\[
T = \sum_{k=0}^{K} a_k \, \mathrm{RM}(3*k)
\]

has LOMO(3) sequences as its only fixed points provided $a_k \ne 0$ for some odd k. If
$a_k = 0$ for all odd k, then T has both LOMO(3) and alternating sequences as its fixed
points.


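To obtain the fixed points of $T^{\ell}$ we have only to rewrite $T^{\ell}$ as follows:

\[
\begin{aligned}
T^{\ell} &= T^{\ell-1} * \left[ \sum_{k=0}^{K} a_k \, \mathrm{RM}(3*k) \right] \\
&= a_0 T^{\ell-1} + a_1 T^{\ell-1} * \mathrm{RM}(3)
 + \sum_{k=2}^{K} a_k \, T^{\ell-1} * \mathrm{RM}(3*(k-1)) * \mathrm{RM}(3).
\end{aligned}
\]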

If $\{x_n\}$ is a fixed point of $T^{\ell}$, then by Lemma 6, Theorem 7, and the assumption that
$a_0 \ne 1$, it must be a LOMO(3) or an alternating sequence. We can also check that the
alternating sequence cannot be invariant under $T^{\ell}$ unless (1) $a_k = 0$ for all even k
and $\ell$ is even, or (2) $a_k = 0$ for all odd k. In the second case, the alternating
sequence is a fixed point of T.

In order to avoid the alternating sequence either as a fixed point or as a
recurrent sequence, it seems advisable to use $T_1 = \alpha I + (1-\alpha)\mathrm{RM}(3)$ or
$T_2 = \alpha \mathrm{RM}(3) + (1-\alpha)\mathrm{RM}(3*2) = \mathrm{RM}(3) * T_1$ in the place of a single RM(3).
Indeed, both are free of the alternating sequence. In Section D we shall have more
discussion on the smoother $T_1$ as well as another smoother with $\alpha$ being a function of
$\{x_n\}$ instead of being a constant. The above family of nonlinear smoothers which
satisfy property (i) can be viewed as an extension of the RM(3) smoother; nevertheless,
they differ from one another in smoothing effect or power. Unfortunately, due to their
nonlinear nature, questions concerning details on this aspect do not seem readily
amenable to the fixed point theory developed above.
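The behavior of the alternating sequence under RM(3), RM(3,3) and $T_1$ can be illustrated on one period of the sequence, treated circularly. The sketch below is only an illustration: the value $\alpha = 0.25$, the iteration count and the function names are arbitrary choices made here.

```python
from statistics import median

def rm3_circular(p):
    """RM(3) on the doubly infinite periodic extension of the list p."""
    n = len(p)
    return [median((p[j - 1], p[j], p[(j + 1) % n])) for j in range(n)]

def t1(p, alpha):
    """T_1 = alpha*I + (1 - alpha)*RM(3), applied to one period."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(p, rm3_circular(p))]

x = [0.0, 1.0]                            # one period of ..., 0, 1, 0, 1, ...
print(rm3_circular(x))                    # RM(3) interchanges the two values
print(rm3_circular(rm3_circular(x)))      # RM(3,3) restores the sequence

y = x
for _ in range(60):                       # repeated application of T_1
    y = t1(y, 0.25)
print(y)
```

Each application of $T_1$ multiplies the gap between the two values by $(2\alpha - 1)$, so for $0 < \alpha < 1$ the iterates approach a constant sequence, which is LOMO(3).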

We conclude this section with some miscellaneous results.

Lemma 7: Let $\{y_n\} = \mathrm{RM}(3)\{x_n\}$. If $(x_{n-1}, x_n, x_{n+1})$ is monotonic, then
$(y_{n-2}, y_{n-1}, y_n, y_{n+1}, y_{n+2})$ is LOMO(3).

An immediate consequence of Lemma 7 is

Theorem 9: If there exists a monotonic segment $(x_{n-1}, x_n, x_{n+1})$ then RM(3*$\ell$)$\{x_n\}$
converges pointwise to a LOMO(3) sequence as $\ell \to \infty$. If $\{x_n\}$ is a finite sequence,
$1 \le n \le N$, then RM(3*$\ell$)$\{x_n\}$ is LOMO(3) for some $\ell$ where $\ell \le [(N-1)/2]$.

Another quick result is

Theorem 10: For $T = \alpha I + (1-\alpha)\mathrm{RM}(3)$, $0 < \alpha < 1$, the sequences $T^{\ell}\{x_n\}$ converge
pointwise to a LOMO(3) sequence as $\ell \to \infty$.
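The finite-sequence part of Theorem 9 lends itself to a direct experiment: apply RM(3) repeatedly and count the passes needed to reach a LOMO(3) sequence. The sketch below assumes the end-point rule of Section B (the two end samples are left unchanged by RM(3)) and reads the bound as the integer part of (N-1)/2; the function names are illustrative.

```python
from statistics import median

def rm3(x):
    """RM(3) on a finite list; the first and last samples are kept as they are."""
    if len(x) < 3:
        return list(x)
    return [x[0]] + [median(x[j - 1:j + 2]) for j in range(1, len(x) - 1)] + [x[-1]]

def is_lomo3(x):
    return all((a <= b <= c) or (a >= b >= c) for a, b, c in zip(x, x[1:], x[2:]))

def passes_until_lomo3(x):
    """Number of RM(3) passes needed before the sequence becomes LOMO(3)."""
    count = 0
    while not is_lomo3(x):
        x = rm3(x)
        count += 1
    return count

for x in ([0, 1, 0, 1, 0], [0, 1, 0, 1, 0, 1], [4, 0, 3, 1, 2, 5, 0]):
    print(x, passes_until_lomo3(x), (len(x) - 1) // 2)
```

Alternating inputs appear to be the slowest case, since each pass only smooths one more sample inward from each end.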

So far we have barely mentioned RM(5). One result which we have obtained is

Theorem 11: Let $T = \alpha \mathrm{RM}(3) + (1-\alpha)\mathrm{RM}(5)$, $0 < \alpha < 1$. Then $\{x_n\}$ is a fixed
point iff it is LOMO(4).

D. Locally Monotonic Regression

Assume that the signal sequence $s_1, s_2, \ldots, s_N$ of length N is locally monotonic
of length m and that it is corrupted by an error sequence $e_1, e_2, \ldots, e_N$, with
$e_n$ i.i.d. random variables. Let $x_n = s_n + e_n$ be the observed sequence. If the
error is Gaussian of zero mean, then the maximum likelihood estimate of $s_n$,
$1 \le n \le N$, is the sequence $\{u_n\}$ which minimizes

\[
\sum_{n=1}^{N} (x_n - u_n)^2 \tag{1}
\]

among all the LOMO(m) sequences. The special case where m = N and where $\{u_n\}$
is required to be non-decreasing is generally known as isotonic regression, or
antitonic regression if $\{u_n\}$ is non-increasing. If m = N and if $\{u_n\}$ can be either
non-decreasing or non-increasing, then a proper name should be "monotonic regression"
as suggested by Barlow, et al. (see Ref. 9, p. 56). Therefore it seems appropriate
to call the solution to Eq. (1) the "locally monotonic regression" of $x_n$, $1 \le n \le N$.
At first we were tempted to think that the regression should lie in between the upper
and lower envelopes of $\{x_n\}$ (with modifications at the endpoints). However, a simple
counterexample shows that this is wrong. In the following, a simpler problem is solved
and is used to justify some of the compound smoothers described in Section C. It is
straightforward to show that


Lemma 8: Let $(\hat{s}_1, \hat{s}_2, \hat{s}_3)$ be the monotonic regression of $(x_1, x_2, x_3)$. Then

\[
\hat{s}_2 = [\, x_2 + \text{median}(x_1, x_2, x_3) \,] / 2 .
\]
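Lemma 8 can be checked against a direct least-squares computation. The sketch below uses the pool-adjacent-violators procedure, a standard method for isotonic regression that is swapped in here for the check and is not taken from the report, to fit both a non-decreasing and a non-increasing sequence and keep the better one; all names are illustrative.

```python
from statistics import median

def isotonic_fit(x):
    """Least-squares non-decreasing fit by pooling adjacent violators."""
    blocks = []                                   # each block is [sum, count]
    for v in x:
        blocks.append([v, 1])
        # Merge while the previous block mean exceeds the last block mean.
        while (len(blocks) > 1 and
               blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    return [s / c for s, c in blocks for _ in range(c)]

def monotonic_fit(x):
    """Better of the non-decreasing and non-increasing least-squares fits."""
    up = isotonic_fit(x)
    down = [-v for v in isotonic_fit([-v for v in x])]
    def cost(u):
        return sum((a - b) ** 2 for a, b in zip(x, u))
    return up if cost(up) <= cost(down) else down

for t in ([0, 3, 1], [2, 3, 0], [5, 1, 4], [1, 2, 3]):
    print(t, monotonic_fit(t)[1], (t[1] + median(t)) / 2)
```

In each printed row the middle value of the fitted triple agrees with the closed form of the lemma.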

Suppose the signal sequence is LOMO(3) and is corrupted by an i.i.d. Gaussian
error sequence $e_n$. Then the maximum likelihood estimate of $s_n$ given $x_{n-1}$, $x_n$ and
$x_{n+1}$ is, according to the above,

\[
\hat{s}_n = \tfrac{1}{2} x_n + \tfrac{1}{2} \, \text{median}\{x_{n-1}, x_n, x_{n+1}\} \tag{2}
\]

or

\[
\{\hat{s}_n\} = \tfrac{1}{2} [\, I + \mathrm{RM}(3) \,] \{x_n\} \tag{3}
\]

where I is the identity operator. Equation (3) can be rewritten as

\[
\{\hat{s}_n\} = \mathrm{RM}(3)\{x_n\} + \tfrac{1}{2} [\, I - \mathrm{RM}(3) \,] \{x_n\}
\]

which has a structure similar to that of "twicing" as used by Tukey. Here the
"rough" $[I - \mathrm{RM}(3)]\{x_n\}$ is reduced in magnitude by half and is then added back to the
"smooth" $\mathrm{RM}(3)\{x_n\}$. (See Beaton and Tukey for the definition of the terms used
above.)

If the error has a density function $c \exp(-a|x|^{\gamma})$ then the solution to the simplified
regression problem is

\[
\hat{s}_n =
\begin{cases}
(x_n + y_n)/2, & \gamma > 1, \\
\text{any number between } x_n \text{ and } y_n, & \gamma = 1, \\
x_n \text{ or } y_n, & 0 < \gamma < 1,
\end{cases}
\]

where $y_n = \text{median}(x_{n-1}, x_n, x_{n+1})$. Apparently, the solution always lies in

\[
I_n = \{ x \mid (x - x_n)(x - y_n) \le 0 \} .
\]

Another case is the one where the error has Cauchy density, i.e., $1/\pi(1 + x^2)$.
The solution is

\[
\hat{s}_n = \alpha x_n + (1 - \alpha) y_n \tag{4}
\]

where $\alpha$ is a function of $(x_n - y_n)$ and

\[
\alpha =
\begin{cases}
1/2, & |x_n - y_n| \le 2, \\
1/2 \pm [\, 1/4 - 1/(x_n - y_n)^2 \,]^{1/2}, & |x_n - y_n| > 2.
\end{cases} \tag{5}
\]

For smoothing purposes, $y_n$, which is the median, should be more reliable than $x_n$
when the difference is large and hence should be given more weight. Therefore the
negative sign is chosen in case $|x_n - y_n| \ge 2$. We may write $\hat{s}_n$ as

\[
\hat{s}_n = y_n + (x_n - y_n)\alpha = y_n + g(x_n - y_n)
\]

where

\[
g(x) =
\begin{cases}
x/2, & |x| \le 2, \\
\left( 1/2 - \sqrt{1/4 - 1/x^2} \right) x, & |x| > 2.
\end{cases}
\]

The odd function g(x) is linear for small |x| (i.e., $|x| \le 2$) and it decreases in
magnitude to zero as |x| goes beyond that. This is a property which coincides well
with our intuition, especially if the noise has a Gaussian-like center and stretched
tails. Again, the nonlinear function g(x) plays the role of Tukey's "smoothing the
rough" (Ref. 2). The design of g(x) can be rather arbitrary to meet different
requirements.




The  principle  is  that  it  should  be  linear  around  the  origin  and  decrease  in  magnitude 
to  zero  as  |x|  goes  beyond  a certain  point. 
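As an illustration of such a function, the sketch below codes the g(x) obtained for the Cauchy case (with the negative sign chosen for |x| > 2) and uses it to smooth a single sample from its two neighbors. The function names are introduced here only for the example.

```python
import math

def g(x):
    """Linear near the origin, decaying toward zero for large |x| (Cauchy case)."""
    if abs(x) <= 2:
        return x / 2
    return (0.5 - math.sqrt(0.25 - 1.0 / (x * x))) * x

def smooth_sample(x_prev, x_cur, x_next):
    """Estimate y_n + g(x_n - y_n), where y_n is the median of the three samples."""
    y = sorted((x_prev, x_cur, x_next))[1]
    return y + g(x_cur - y)

print(g(1.0), g(2.0), g(10.0), g(100.0))
print(smooth_sample(0.0, 1.5, 1.0))    # small deviation: halfway between x_n and y_n
print(smooth_sample(0.0, 10.0, 1.0))   # large spike: the estimate stays close to y_n
```

For large |x| the value g(x) behaves roughly like 1/x, so a wild sample contributes almost nothing beyond the local median.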

Since the smoother T of Eq. (4) satisfies Property (iii) of Section C, it has
only LOMO(3) fixed points. However, when applied to a data sequence, it usually does
not generate the flats. A final observation is that $T^{\ell}\{x_n\}$ converges pointwise to a
LOMO(3) sequence as $\ell \to \infty$.
